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ABSTRACT 



Query objects arc created by a client process. The query 
objects have one or more sub-query objects and one or more 
execute methods that are capable of operating on their 
respective query object to produce one or more query 
expressions. All of the execute methods are capable of 
producing the respective query expression that'is compatible 
with a structured query language. A compound query con- 
tains one or more boolean expressions of one or more of the 
query objects. The compound query has one or more com- 
pound execute methods which invoke one or more the 
execute methods of each of the query objects. Each of the 
execute methods returns their respective query expression 
and the compound execute method uses one or more com- 
mon table expressions to combine the query expressions to 
form a single compound query expression that represents the 
boolean expression. This single expression can be execute 
against a database to return a result without executing any of 
the query expressions against the database individually. 

18 Claims, 33 Drawing Sheets 
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SYSTEM AND METHOD FOR 
PERFORMANCE COMPLEX 
HETEROGENEOUS DATABASE QUERIES 
USING A SINGLE SQL EXPRESSION 

This application claims benefit to Provisional Applica- 
tion No. 60/108,754 filed Nov. 17, 1998. 

FIELD OF THE INVENTION 

This invention relates to computer database searching. 
More specifically, this invention relates to the formulation 
and the efficient execution of complex database queries 
using a single query expression against the database. 

BACKGROUND OF THE INVENTION 

The amount of multimedia data available in electronic 
format is ever increasing. The cost of loading such data into 
a database is quite high and it is desirable that this task does 
not have to be repeated wheo writing different applications 
which use such data. Furthermore it is desirable to be able 
to add different databases to a system without the need to 
rewrite the application in a major way. In general, a rela- 
tional database comprises tables which contain records that 
have a zero-to-many relationship to records in other tables. 
A query is formulated against one or many tables as appro- 
priate and upon execution returns a set of records. To gel the 
desired user query resolved, several sub-queries queries may 
have to be formulated, and then the results of each of these 
sub-queries queries combined. 

For example, assume a DB2 (IBM™) database which is 
populated with several tables. Each table has many records 
(rows) and many columns. A user can pose a query like: find 
aU the DEALERS which have PINK CADILLACS in 
STOCK (referred to as parametric query). In this example, 
there are at least the following columns in the database: 
DEALERS, COLOR, MAKE, AVAILABILITY. (This is a 
straightforward database example). Now lets assume that 
with each row in the table, there is also a textual description 
column. Some databases like DB2 have a special method 
(called DB2 TextExtenders, IBM™) to search such textual 
columns for the occurrence of a string or a logical expression 
of words (c.(g. USED or NEW). A multi-search query would 
for instance extend the above query by adding the query 
condition "USED or NEW". One way to execute the query 
is to first execute the parametric query and store its results 
in an application, then execute the textual query and store its 
result in the application. The application then combines the 
results of the two sub-queries queries (e.g. parametric and 
textual) for a final result. The problem is that each of the 
sub-queries queries may return a big result set, which is 
expensive to transmit from the database to the application. 
Furthermore, combining results from the sub-queries queries 
is expensive. 

OBJECTS OF THE INVENTION 

An object of this invention is an improved database query 
system and method. 

An object of this invention is novel object oriented query 
data model. 

An object of this invention is to formulate a single valid 
query against a relational database which eliminates the use 
of intermediate result sets and uses the database for perfor- 
mance optimization while maintaining some flexibility to 
perform some other optimization. 

An object of this invention is to formulate a single valid 
query as described above which returns in addition to rows 
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from tables from the database also returns computed values 
in some of the result columns. 

SUMMARY OF THE INVENTION 

5 In a preferred embodiment, the present invention works in 
a computer system having one or more central processing 
units and one or more memories. The computer system has 
an interface to one or more databases, one or more base 
query objects, one or more query objects, one or more 

io compound queries, one or more annotator objects and one of 
more graphical user interfaces (GUI's). The base query 
objects have one or more base query object methods, one or 
more base variables, and one or more base objects, one or 
more of the base query object methods being specific to the 

is specific database and capable of querying the specific data- 
base. Each of the query objects derived from one of the base 
objects, and each of the query objects has a query type, one 
or more query object methods, one or more query object 
variables, and one or more query object objects. Each query 

20 object method is capable of querying a specific database to 
obtain a type result having the respective type. The com- 
pound query has one or more compound query methods, one 
or more compound query variables, and one or more com- 
pound query object objects. The operator objects, are 

25 derived from one of the base query objects that are used with 
the specific database. The graphical user interface (GUI) has 
one or more query elements with one or more operators. 
Each query element is one of the query types but being 
database independent, the query elements, operators, and 

30 conditions are user selectable. The process that, for each 
query element, operates on the query object with the same 
type as the query element to create an instance of the query 
object with the query element as one of the query object 
variables, creates one or more operator object instances from 

35 the operator objects corresponding to the operators, and 
operates on the compound query object to create a com- 
pound query object instance. The compound query object 
instance uses the instances and the operator object instances 
to create a query expression for the specific database. 

40 Therefore the input in the GUI is translated into a single 
compound query object. 

There are different methods to evaluate a complex user 
query. In the present invention we propose a very efficient 
way of translating a complex user query into a single query 

45 string in a structured query language. 

The^queryj^bj ects -are • cre ated;byrarclient iprocess. The 
query objects have one or more sub-query objects and one 
or more execute methods that are capable of operating on 
their respective query object to produce one or more query 

50 expressions. All of the execute methods are capable of 
producing the respective query expression that is compatible 
with a structured query language. A'com ppund -quer y- coo- 
(tajns one or more boolean expressions of one or-more ofcth e 
query- objcctSrT hc compound query has one or more com- 

55 pound execute' methods which invoke one or more the 
execute methods of each of the query objects. Each of the 
execute methods returns their respective query expression 
and the compound execute method uses one or more com- 
mon table expressions to combine the query expressions to 

60 form a single compound query expression that represents the 
boolean expression. This single expression can be executed 
against a database to return a result without executing any of 
the query expressions against the database individually. 

65 BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram showing one □on-limiting 
example of a preferred embodiment of the present system. 
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FIG. 2 is a block diagram of a novel query object ORIENTED QUERY MODEL AND PROCESS FOR 

architecture of the present invention. COMPLEX HETEROGENEOUS DATABASE QUERIES 

FIG. 3 is a block diagram of a Base Query Object. to Coden ct *U claiming priority to provisional patent 

171/- a • ki 1 j- f t^a. ^ /^u- application No. 60/108,756) which is filed on the same day 

FIG. 4 is a block diagram of a Text Atom Query Object. $ ^ applicatioQ aod £. herein mcorponteA by reference ? 

FIG. 5 is a block diagram of an Parametric Attribute its entirety. The invention will be described in the descrip- 

Query Object, tion of FIGS. 20-23. 

FIG. 6 is a block diagram of a' Feature Atom Query FIG. 1 is a block diagram of one preferred system 100 

Object. used in performing the process 1000 of a preferred embodi- 

F1G. 7 is a block diagram of an Operator Query Object. 10 ment of present invention. This non limiting example 

FIG. 8 is a block diagram of a Parenthesis Query Object. ^T^Tr™ " ^ T^f 

<v . , , „ J (trademark of IBM Corp.) running an operating system like 

FIG. 9 is a block diagram of a Compound Free Text Query ^ e Microsoft windows NT 4.0 operating system (trademark 

° b J ect - of Microsoft Corp.) and IBM Database 2 Single User 

FIG. 10 is a block diagram of a Compound Boolean Text 1$ Edition for Windows NT, version 2.1.2 or higher (trademark 

Query Object. of IBM Corp.), or equivalent. The data processing system 

FIG. 11 is a block diagram of a Compound Parametric 100 includes a processor 102, which includes a central 

Query Object. processing unit (CPU) 104 and memory 106. Additional 

FIG. 12 is a block diagram of a Compound Feature Query memory, such as a hard disk file storage 108 and a remov- 

Object 20 able media device 110 having removable media 112 may be 

FIG. 13 is a block diagram of a Boolean Compound conn £ ted to Additional memory like 140 

Ouerv Obiect can connected via a network and contain one or more 

^ 1, . , . ' p . , , . databases 150. The removable media device 110 may read 

FIG. 14 is a flowchart of a typical execute method which from mdf Wlilcg tQ lfae removable media m 

is a part of Typed Compound Query Object. ^ Examples of me removable media 112 include: a magnetic 

FIG. 15 is a flowchart of a typical linearize method which tape, a compact disk-read only (CD-ROM), write once 

is a part of a Base Query Object and all Derived Query (CD-R) or rewritable (DC_RW) memory, and any other 

Objects. well known readable and writable media. Memory 106, 108, 

FIGS. ISA, 15B and 15C show three representations of 112 may have computer program code recorded therein that 

the same Boolean query expression, exemplifying the con- 30 implements portions of the present invention in the data 

cepts "recursive nesting," "hierarchical tree", "parent" and processing system 100. Inputs may also be received from 

"child Query Objects". input devices that could include: a fax/modem 114 or 

FIG. 16 is a flowchart of a typical execute method which network interface card 114A, which is connected to a 

is part of a Boolean Compound Query Object or a Com- telephone line 132 and/or a local area or wide area network 

pound Feature Query Object. ' 35 116 e S- ^ e Internet. The data processing system 100 also 

FIG. 17 is a block diagram of a Result Object. can include user interface hardware^ such a^ a pointing 

_ 40 . . 1 device (e.g. a mouse) 120, a keyboard 122, an optical 

FIG. 18 is a block diagram of a Common Table Expres- U8 and a raicrophoQC m for allowing user input 

sion Query Object. to mc proccssor 102. Th c data processing system 100 may 

FIG. 19 is a flow chart of a Compound Query Instantiation w have output devices that could include: one or more visual 

process that creates the necessary instances of Typed display devices (e.g. a monochrome or color display monitor 

Elementary Query Objects, Typed Compound Query 126) and/or a monochrome or color display printer 128, for 

Objects, Annotator Objects and Boolean Compound Query rendering visual information. For instance, the Graphical 

Objects containing a query expression formulated with the User Interface (GUI) can use 126 to display the query 

previously mentioned Derived Query Objects which encap- 45 element 134, query operators 136 and conditions 138 which 

sulatc a user query from input from a Graphical User are used to specify the user query. In some alternative 

Interface. embodiments, the data processing system includes: an audio 

FIGS. 19A and 19B show examples of GUI query ele- display device, such as a speaker 130, for rendering audio 

ments. information. A Telephone 132 may also be connected to the 

FIG. 20 is a flowchart of a high performance execute 50 telephone line 116. 

method in a Boolean Compound Query Object or a Com- Process 1900 is stored in one or more of the memories 

pound Feature Query Object. (e g. 105, 108) and executed by one or more of the CPUs 

FIG. 21 is a flowchart of a high performance execute 

method in a Compound Free Text Query Object. 0nc problem addressed here is to be able to have any 

FIG. 22 is a flowchart describing the process of entering 55 S encra ! GUI accessing any general database with any 

data into a hashtable as used in the high performance execute f neral n pnor art an application had to be rewritten 

method as using in FIG. 21. to f «*™»» te a new GUI ' or a ™* d J? ab 1 aS0 ° r T^? 

6 make performance improvements. We disclose an flexible 

FIG. 23 is a flowchart describing an alternative process of and modular architeclure which creates a GUI and database 

entering data into a hashtable as used in the high perfor- 60 independent representation of a user query which enables a 

mance execute method as using in FIG. 21. GUI or database to be changed independently. Performance 

DFTAlT Fn nPSPRlPTlON OF THF improvements can also be made without the need of rewrit- 

DETAILED Dj^WraON OF THE ^ ^ wholc application . 

FIG. 2 shows a hierarchy of objects — Base Query Objects 

A preferred architecture for using the invention is 65 220, Annotator Objects 290, Typed Elementary Query 

described in FIGS. 1-19 below and is further described and Objects 260, Typed Compound Query Objects 340 and 

claimed in U.S. patent application entitled A OBJECT Boolean Compound Query Objects 350. This set of objects 
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are the core of the architecture. All the above mentioned 
objects are derived from a base query object (standard 
Object Oriented Programming meaning). Each base query 
object knows how to connect to a specific database, send 
queries to a database and receive results. Annotator Objects 
290 are a convenient representation for operators and paren- 
theses. Typed Elementary Query Objects 260 capture the 
basic building blocks of a query, like a text string, or a 
attribute, operator, value triplet (e.g., Movie Producer- 
Hitchcock). Typed Compound Query Objects 340 are used 
to express a complex query of a specific type, for instance 
a query of the form "SCARY SHOWER SCENE OR BIRDS 
FLYING" is a textual query composed of two phrases which 
are combined with the operator OR. Such a query would be 
expressed with Typed Compound Query Object 340 which 
would contain the query which would be expressed as the 
OR of two Typed Elementary Query Objects 260. A Boolean 
Compound Query Object 350 is used to express the com- 
plete user query which is composed of sub-queries queries 
of different types. The above mentioned example could be 
augmented to include the parametric query "DIRECTOR 
HITCHCOCK" and the AND operator could be used to 
express the fact that the user wants to find two particular 
Hitchcock movies ("Psycho" and "The Birds"). 

Process 1900 maps the information entered in the GUI 
into Derived Query Objects 360 which are described in 
detail in FIGS. 4-«. The Typed Elementary Query Objects 
260 are derived from a Base Query Object 220 and contain 
one or more objects and methods which are able to set and 
write typed information from/to the GUI and map the 
information into a database dependent query language. Pro- 
cess 1900 then continues by using information either from 
the GUI or independently created to instantiate Typed Com- 
pound Query Objects 340 which express Boolean combina- 
tions of Typed Elementary Query Objects 260 by type and 
are described in more detail in FIGS. 9-12. The process 
1900 then continues by using information either from the 
GUI or independently created to instantiate a Boolean 
Compound Query Object 350 which expresses the user 
query using Typed Compound Query Objects 340 and is 
described in more detail in FIG. 13. Annotator Objects 290 
are used to express the Boolean combinations of objects 
both within the Typed Compound Query Objects 340 and the 
Boolean Compound Query Object 350. The Boolean Com- 
pound Query Object 350 encapsulates the user query in a 
GUI and database independent format. 

To attach a different database the connection method in a 
base query object has to be changed. If the structured query 
language is different, only new methods expressing the 
change have to be added. A different GUI may entail no 
changes, or some changes in process 1900 which maps new 
graphical user interface components into Typed Elementary 
Query Objects 260. The core of an how an application parses 
a complex query and executes it would rem am unchanged. 

An application using the architecture described in this 
disclosure would be written in the following fashion. Each 
application has a user interface, most likely a graphical user 
interface (GUI). Each database which is part of the appli- 
cation has tables which have names aad each of the tables 
has columns which are named in turn. The names used by 
the database may be not suitable for the application. An 
application builder can create a file which maps the database 
names into mare user friendly names. 

In the first step the GUI is mapped into Typed Elementary 
Query Objects 260. The names of these query objects 
suggest that they are the basic building blocks which can be 
used to express more complex user queries. For example, a 
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Text Atom Query Object 230, encapsulates a single text 
string. In the next step, other aspects of the user interface are 
taken into account to create Typed Compound Query 
Objects 340, Annotator Objects 290 and a Boolean Com- 

5 pound Query Object 350 which ultimately describes the 
whole user query. The aspects of the user interface taken into 
account are which combinations of the basic query elements 
the user chose for the query. For example, besides specifying 
a text string, the user may specify also the resulting video for 

lQ instance was aired after Aug. 14th, 1953 and was produced 
by either an American or Canadian company. A user inter- 
face may allow the user to specify all possible combinations 
of the building blocks or have default values on how they 
should be combined. 
The Query Objects (220, 360) and their methods are some 

15 of the novel features of the disclosure. They have several 
characteristics in common. They provide for a user interface 
and database independent representation of a user query. 
They have methods to render them in a specific user 
interface, to translate them into an expression which is 

20 specific to a database and can be (either alone or in con- 
junction with other expressions) executed against the data- 
base. Note that the execute method is type specific. For 
instance, a Compound Free Text Query Object 300 has an 
execute method which can perform a free text query which 

25 returns ranked results. The execute method of a Compound 
Parametric Query Object 320 in turn returns rows which 
satisfy the query condition. They may contain a method to 
render the query into a structured query language which the 
user can manipulate to specify a query. They have objects to 

30 store results of a query and translate them into a user 
interface and database independent format which can be 
used for post processing if so desired. Continuing the 
necessary steps in an application, the query condition (e.g., 
the text string and the date in the above mentioned example) 

35 expressed with the above mentioned objects and captured in 
t "the Boolean Compound Query Object 350 (see description 

| of FIG. 13 for more details) is transformed into a Boolean 

| expression in infix notation composed of Typed Compound 

( Query Objects 340 and Annotator Objects 290. rThis"Boolean> 

^exprej^on-is^h^e^^^ 

Object 340;cxecutcsjhj_qucry. it encapsulatesand. the; results 
arc.thcnrcombincd accordi ng- to*. the: precedence-rules- of; the 
Boolean expression - and~the operators and parenthe sis- use d 
.to;instantiat e-a r esulfobject (FlGr-17)^hiciris^gain a-user 

45 inter face - and c^ baseTmdependent rdbject?* 

The architecture is expandable to include new Query 
Objects at all levels (Typed Elementary Query Objects 260, 
Annotator Objects 290 and Typed Compound Query Objects 
340) as long as they follow the same structure to include 

50 type specific execute and rendering methods. A'systenTusihgr? 
mis:architecture;can be"^amtained;and:modi fied at a- modu £> 
larlevelPFor instance, when a new user interface is desired, 
some new rendering methods may have to be added to the 
system. A system and application using this data model can 

55 add/change a graphical user interface (GUI) without impact- 
ing the database query operation. Another example of the 
flexibility of the system is when adding a new different 
database which uses a different structured query language. In 
that case some of the execute methods may have to be 

60 changed but the rest of the structure and the application can 
remain unchanged, and there is no need to change the GUI 
(unless new functions are added which should be exposed to 
the user). Details of the database layout are hidden from the 
GUI and hence can be changed without impacting the 

65 application. 

FIG. 2 is a block diagram of the novel Query Object 
architecture 200 of the present invention. The architecture 



06/04/2004, EAST Version: 1.4.1 



US 6,3' 

7 

allows for one or more homogeneous databases and/or two 
or more heterogeneous databases 210. For each database, 
there is a base query object 220, which contains an object 
311 encapsulating the connection to its database and an 
object 315 encapsulating a connection to a GUI in the 
system. All query objects mentioned in the description of 
this figure arc described in more detail in subsequent figures. 

There arc many query objects which can be derived from 
a Base Query Object 220 using well known Object-Oriented 
techniques. Each of these Derived Query Objects 360 has a 
query type associated with it. Some preferred Derived Query 
Objects 360 shown in FIG. 2 are: the Typed Elementary 
Query Objects 260, the Annotalor Objects 290, the Typed 
Compound Query Objects 340 and the Boolean Compound 
Query Objects 350. Some preferred Typed Elementary 
Query Objects are the Text Atom Query Objects 230, the 
Parametric Attribute Query Objects 240 and the Feature 
Atom Query Objects260. Some preferred Annotator Objects 
arc the Operator Query Objects 270 and the Parenthesis 
Query Object 280. Some preferred Typed Compound Query 
Objects are the Compound Free Text Query Object 300, the 
Compound Boolean Text Query Object 310, the Compound 
Parametric Query Object 320 and the Compound Feature 
Query Object 330. 

The architecture is not limited to these Query Objects and 
can accommodate others. Each Query Object contains mem- 
ber objects and methods which are described in more detail 
in FIGS. 3-18. At least one of the methods in each of the 
Derived Query Objects 360 can upon execution produce a 
valid structured language string which in conjunction with 
other valid strings could be executed against the database 
210 and return Results. Furthermore, each Derived Query 
Object 360 contains a method 390 which can render the 
query expression in a graphical user interface or in a user 
interface structured Query Language. Each Typed Com- 
pound Query Object 340 contains an object which describes 
a query and an object which upon execution of the query 
contains the results. A Boolean Compound Query Object 
350 contains a query expression (a Boolean expression of 
Typed Compound Query Objects 340 and Annotator Objects 
290), at least on linearize method which takes the query 
expression and transforms it into postfix notation, and at 
least one execute method which performs the following 
steps: execute the query as encapsulated in the query expres- 
sion in each of the Typed Compound Query Objects 340 and 
then combine the results of each of these sub-queries queries 
according to the rules of the Operator Objects 270. The 
algorithms for combining are implementation specific and 
could also be chosen by the user. 

FIG. 3 is a block diagram of a typical Base Query Object 
220. A Base Query Object 220 contains objects (311, 315, 
321, 325, 331, 335) methods (351, 355, 361, 365, 370, 375, 
380, 385, 390, 395). Each database in the system has its own 
Base Query Object 220. Its main function is to know how to 
communicate with its associated database, in particular how 
to send a query and receive the results. Other important 
methods are: how to transform a query expression into a 
query using postfix notation (370), take results and trans- 
form them into a database independent format (380) and 
rendering methods (390) which can show data in the GUI. 

All the objects and methods in a Base Query Object 220 
are now described in more detail. Object 311 describes a 
connection to a database 210, in particular it contains all the 
information necessary to write execute methods 375 which 
can submit a query to its associated database and retrieve the 
results. In a preferred embodiment, the connection is to DB2 
and object 311 contains methods which allow SQL strings to 
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be executed and their results received. Object 315 describes 
the connection to the Graphical User Interface (GUI). In one 
preferred embodiment mappings from names known to the 
database to names preferred by the user are done in the GUI 

5 and have to be accessible to the Base Query Object 220. 
Object 321 contains a query expression and there are many 
different structures which accommodate such an expression. 
However, all structures represent a valid Boolean expression 
of Derived Query Objects 290 in an infix notation. Objects 

10 325 are optional as an application or implementation sees fit. 
Object 331 describes which values the user requested as 
results for the query. Object 335 can hold the results of an 
executed query in a database and user interface independent 
structure. There are many suitable implementation of such 

1S an object. In one preferred embodiment, the results are table 
(a two dimensional array) with a separate linked list which 
contains the column headings and their types. The Base 
Query Object 220 is used to derive the Derived Query 
Objects 360 according to Oriented Programming rules. 

20 Object 341 is a Compound Query Expression and there 
are many different structures which can accommodate such 
an expression. A Compound Query Expression 341 captures 
the same Boolean expression as the Query Expression 321 
using a recursive representation which can express nesting 

25 without the use of Parenthesis Objects 280. The linearize 
method 371, which is described in more detail in FIG. 15, 
transforms a Compound Query Expression 341 into a Query 
Expression 321. It will be apparent in FIG. 19 how Com- 
pound Query Expressions are a useful way to represent an 

30 end user query as expressed in a GUI. A Query Expression 
321 on the other hand is an useful representation for execut- 
ing a user query against a database in a fast manner. 

A Base Query Object 220 contains many methods. The 
method ADD 351 can insert Typed Elementary Query 

35 Object 260, a Annotator Object 290 or a Typed Compound 
Query Object 340 into the query expression 321. The 
method REMOVE 355 can remove any of the Query Objects 
from the query expression. Methods 360 are "set" and "get" 
methods. For each Object contained in the Base Query 

40 Object 220 there is a "set" method which can set variables 
and constants and a "get" method to retrieve them. Methods 
365 are optional as an application or a specific implemen- 
tation requires. Methods 371 — the linearize methods — can 
transform a query expression into a Boolean expression into 

45 postfix format. There can be multiple linearize methods, at 
least one per structure which holds the query expression and 
there could be multiple ones depending on an actual imple- 
mentation. Methods 375 are execute methods which execute 
a query against the database and retrieve results. These 

50 methods use the connection object 311 described above. 
Method 380 transforms results returned from the database 
into a database independent format which can be used by 
other query objects including the GUI. In particular, method 
380 knows the database specific format in which the typed 

55 results are returned from the database. In general the results 
can have different types like integers, floating point 
numbers, strings to point out a few non limiting examples. 
In one preferred embodiment, there arc different functions to 
extract each of these types from the result object (FIG. 17) 

60 as returned from the database and put it into a database 
independent array. The types of the results (integer, floating 
point, string etc.) are preserved in that operation. The array 
has a header which for each column associates the column 
name, the database table it belongs to and the type. Hence 

65 enough information is retained to know from where the 
results came, however, the array itself (or any other equiva- 
lent representation) is database independent. As a 
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consequence, two such arrays could each contain results 
from queries executed from different databases. Some of the 
aspects of this methodology are covered in U.S. Pat. No. 
5,873,080, entitled "Using Multiple Search Engines to 
Search Multimedia Data" issued to Anni R. Coden et al. oo 
Feb. 16, 1999 which is herein incorporated by reference in 
its entirety. Methods 385 create strings in the structured 
query language which is used by the database specified in 
the connection 311. Such strings (either by themselves or in 
conjunction with other strings) can then be executed against 
the database (using an execute method 375) to produce 
results. In a preferred embodiment, methods 385 create SgL, 
strings. A common SQL statement has the following format: 

SELECT<columnl, column! . . . >FROM<tablcl>WHERE<qucry 
conditio n> 

The keywords SELECT, FROM and WHERE are fixed 
SQL keywords. The columns <columnl, column2, eto arc 
the columns which the user specifies in a query. They can be 
specified in any form the GUI builder finds convenient and 
method 385 translates GUI specifications into column names 
as known by the database. In tbe same fashion, the user may 
have specified which table to search. Such specification 
could be directly by specifying a name or indirectly by 
specifying what type of query is performed. In the later case 
for example, the user could have specified that it is looking 
for an author of a book and the system knows which tabic 
is associated with such a search. The query condition is 
again specified by the user in different ways. For example, 
the user could have specified to search for authors of books 
which have been published in the year 1998 and whose 
subject is parenting. In this example the query condition 
specified in SQL would be SUBJECT-' Parenting' and could 
be specified in the GUI in many different ways. 

Another method 385 creates a string which translates the 
query condition into structured language as defined by the 
user interface. Methods 390 — the render method — use ele- 
ments of the graphical user interface to display the query 
expression. Method 395 is a non limiting example of a 
method which can be applied to results to suit a particular 
application. In one preferred embodiment, method 395 is a 
padding method. For this method to be applied, the results 
contain columns which describe a start and end point of a 
time interval. A resulting time interval could be too small for 
the application to use. In this case "too" small intervals 
would be extended by either a fixed value or by a calculated 
amount as specified by the application. 

The above mentioned objects and methods use standard 
object oriented programming technology. However, the type 
of objects and methods which comprise a Base Query Object 
220 are novel and are necessary to implement the overall 
architecture of an object oriented query model. Each imple- 
mentation of this architecture could very slightly. 

The Typed Elementary Query Objects 260 and the Anno- 
tator Objects 290 can be considered building blocks of the 
Query Object Architecture and are described in FIGS. 4-8 in 
more detail. They contain the simplest forms of query 
expressions, like a textstring or a parametric attribute (e.g., 
PRODUCER-' HITCHCOCK') to name two. Such query 
expressions can then be composed into more complex 
queries using the annotate r objects which then are used to 
instantiate Typed Compound Query Objects 340. A Typed 
Elementary Query Object 260 can be used to form query 
expressions in different Typed Compound Query Objects 
340. For instance, a query expression in a Compound 
Freetext Query Object 300 and in a Compound Boolean Text 
Query Object 310 use Text Atom Query Objects 230 in 
them. 
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FIG. 4 is a block diagram of a Text Atom Query Object 
230. Object 410 holds a basic text string which can be 
specified in the user interface. Such a textstring can be used 
to form more complex queries which are captured in Typed 

5 Compound Query Objects 340. This Query Object is a 
building block, has methods for translating the string into a 
structured query language (e.g., adding the correct 
punctuation) and methods for rendering it in a GUI. 

Object 420 and 430 are optional and their necessity 
depends on the sophistication of the Text Search Engine 
within the database. In one preferred embodiment, the 
database is DB2 and DB2 TextExtender is used to perform 
a text search. Within DB2 TextExtender, the language of the 
text string can be specified using Object 420 and DB2 
TextExtender supports three types of indices (linguistic, 

15 precise and dual). Object 430 is used to specify the type of 
index desired. The Text Atom Query Object 230 can contain 
additional optional objects 440 as deemed necessary by the 
application or the implementation. 
Methods 450 are multiple "set and get" methods for the 

20 variables, constants and objects within the Text Atom Query 
Object 230. Methods 460 translate the text string into 
structured language strings as required by either the database 
210 or the user interface structured language. Methods 480 
are multiple rendering methods which use user interface 

25 elements to show the text string 410. 

FIG. 5 is a block diagram of a Parametric Attribute Query 
Object 240. This query object is a building block in forming 
more complex expressions which are captured in Typed 
Compound Query Objects 340. The information captured in 

30 a Parametric Query Object 240 is the relation as requested 
by a user between a particular column and a user specified 
value. Although a Parametric Attribute Query Object 240 
does not contain explicit methods to execute a query, it 
contains methods on bow to translate the user specification 

35 into a structured query language and methods to render them 
in a GUI. 

Object 510 holds an attribute which is of the form 
<textstringxoperator><value>. The operator can be any the 
database 210 supports and the value can be either a text 

40 string or a numeric value. The Parametric Attribute Query 
Object 240 can contain additional optional objects 520 as 
deemed necessary by the application or the implementation. 

Methods 530 are multiple "set and get" methods for the 
variables, constants and objects within the Parametric 

45 Attribute Query Object 240. Methods 550 translate the 
attribute into structured language strings as required by 
either the database or the user interface structured language. 
Methods 560 are multiple rendering methods which use user 
interface elements to show the attribute 510. 

50 FI G. 6 is a block diagram of a Fe ature Atom Query Object 
250. This Query Object is a building block and quite similar 
to the Parametric Attribute Query Object 240. However, it 
has one additional feature: the user specifies a column, an 
operator and a value, but it does not specify the datatype of 

55 the column. For instance, a user may specify IMAGE= 
MOSTLY GREEN. This may translate to a particular region 
in a color palette, the meaning of the operand being non 
standard and the formulation of a query varying from the 
standard approach. 

60 Object 610 holds an attribute which is of the form 
<textstringxoperator><value>. The operator can be any the 
database supports and is appropriate for the data type of the 
value which can be of any "standard" datatype like text 
strings and numbers, or user defined datatypes like histo- 

65 grams and images to give some non limiting examples. ^ 
A feature is a column in a database table whose values are 
strings. For example MOTION MAGNITUDE, ZOOM, 
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FACES are examples of features. Such features can have 
different values and these values can be of different type. 
MOTION MAGNITUDE could be described using floating 
numbers, ZOOM could be described using strings whose 
values could be IN and OUT and FACES could be described 
by integers like 1,2 and 3. A user searching for videos which 
have certain features does not want to be concerned in which 
format such features are described. In particular a user 
would for instance specify to search for videos where the 
MOTION MAGNITUDE-45.7. 

Object 615 describes the data type of value. In one — i 
preferred embodiment the datatypes for different features as I 
denoted in the text string in Object 610 are stored in a table j 
in the database. — 

Hence the GUI designer does not need to know the type 
of the feature when specifying a feature atom as the infor- 
mation can be filled in by the application automatically. 
However, if the GUI needs to know the type for display 
purposes, it can query the system for it. 

Object 620 describes which values (i.e., which columns in 
which tables) the user requested as results for the query. 
Object 625 holds the results of executing a query. FIG. 17 
describes this result object in more detail. 

The Feature Atom Query Object 250 can contain addi- 
tional optional objects 630 as deemed necessary by the 
application or the implementation. Methods 640 are multiple 
"set and get" methods for the variables, constants and 
objects within the Feature Atom Query Object 250. Methods 
660 translate the feature attribute into structured language 
strings as required by either the database 210 or the user 
interface structured language. Methods 670 are multiple 
rendering methods which use user interface elements to 
show the feature attribute 610 and the feature type 620. 

FIG. 7 is a block diagram of an Operator Query Object 
270. This Query Object is a convenient implementation of 35 
operators like AND, OR, LIKE, EQUAL within this archi- 
tecture. However they have the additional feature that one 
can use methods within an Operator Query Object 270 to 
overwrite the standard meaning of an operator. For instance, 
EQUAL may mean that two textstrings are equal if there are 40 
no differences between them or it they can be called equal 
when they contain the same letter independent of the punc- 
tuation and capitalization. 

Object 710 is a text string describing the operator. The 
Operator Query Object 270 can contain additional optional 45 
objects 720 as deemed necessary by the application or the 
implementation. 

Methods 730 are multiple "set and get" methods for the 
variables, constants and objects within the Operator Query 
Object 270. Methods 750 translate the text string 710 into 50 
structured language strings as required by either the database 
210 or the user interface structured language. Methods 760 
are multiple rendering methods which use user interface 
elements to show the operator 710. 

FIG. 8 is a block diagram of a Parenthesis Query Object ss 
280. This Query Object is a convenient implementation for 
parenthesis which establish a precedence of evaluation 
within a Boolean expression. 

Object 810 is a text string describing the parenthesis. The 
Parenthesis Query Object 280 can contain additional 
optional objects 820 as deemed necessary by the application 
or the implementation. Object 815 — identity — captures 
what type of parenthesis it is, e.g., left/right, round, curly and 
square to name a few non limiting examples. 

Methods 830 are multiple set and get methods for the 
variables, constants and objects within the Parenthesis 
Query Object 280. Methods 850 translate the text string 810 



into structured language strings as required by either the 
database 210 or the user interface structured language. 
Methods 860 are multiple rendering methods which use user 
interface elements to show the parenthesis 810. 

The following is some pseudo code which capture the 
essence of Typed Elementary Query Objects 260 and Anno- 
tator Objects 290. Each of the specific objects may contain 
only some of the proposed code, others some additional one. 
public class TypedElementaryQueryObject extends Base- 

QueryObject { 

private String name; 

private String value: 

private String type; 

private OperatorObject operator, 

private LinkedList resultcolumns; 

private ResultObject results=new ResultObject( ); 

public TypedElementaryQueryObjcct(String N, String V, 
String T, OperatorObject O) { 
name=N; 
value =V; 
type-T; 
operator=0; 



} 



60 



65 



public ResultObject getResults( ) { 
return results; 

} 

public void setResults(ResultObject r) { 
results-r; 

} 

public void setOperator(OperatorObject op) { 
operatomop; 

} 

public OperatorObject getOperator( ) { 
return operator; 

} 

public void setresultcolumns(LinkedLtist 1) { 
resultcolumns-1; 

} 

public void setName(String s) { 
name=s; 

} 

public String getName( ) { 
return name; 

} 

public void setValue(String s) { 
value-s; 

} 

public String getValue( ) { 
return value; 

} 
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public void setType(String s) { 
type-s; 

} 

public String gctTypc( ){ 
return type; 

} 

public int whoAmI( ) { 
return 18; 

} 

public String asGUIString ( ) { 

String s-T+index+": "+value+")"; 
return s; 
} 

public String asSql( ) { 
String s-" 
String tcmp=value; 
if (type^tartsWithC'TEXT")) 

tempo"' "+temp+" "'; 
s="FEATURE='"+name + '"AND" + column + 

operator.asSql( )+temp; 
return s; 

} 

public ResultObject createExecute(String s) { 
results=createExecute(s, null, true); 
return results; 

) 

public ResultObject createExecute(String s, CohimnDefi- 
nition c, Boolean dir) { 
String sqlString=s+"WHERE"+asSql( ); 
supe r.sctrcolumns(rcolum ns); 
rcsults-exccuteStatement(sqlString, c, dir); 
return results; 

} 

public String createSQLexpression(String s) { 
String sqlString=s+"WHERE"+asSql( ); 
return sqlString; 



FIG. 9 is a block diagram of a Compound Free Text Query 
Object 300. This object is used to represent a complex free 
text query. A free text query searches documents for the 
words specified in a query in a specially build index which 
is part of the database. Documents containing some or all of 
the words are returned in a rank ordered fashion, where the 
rank is roughly a function of the number of occurrences of 
the specified words in all the indexed documents in the 
database. (Note that different free text search engines com- 
pute the rank differently.) In one preferred embodiment, 
DB2 TextExtender, the free text search engine can support 
multiple indices (precise, linguistic and dual) as described 
FIG. 4 which shows the Text Atom Query Object 230. 
Furthermore, a user query may want to add some additional 
constraints which would narrow down the number of docu- 
ments which should be searched for the occurrences of the 
words — the query condition 915 captures such constraints. 

Object 910 holds the query expression, a collection of one 
or more Text Atom Query Objects 230. Object 915 encap- 
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sulates the query condition if the user wants to restrict the 
number of documents searched. For example, a query may 
want to find the documents which contain the word 
IMPEACHMENT and were written before Jan. 1st, 1998. 

5 The date condition in this example is a non limiting example 
of a query condition. 

Object 920 denotes indices used to perform a free .text 
search. One preferred embodiment of a free text search 
Q engine is DB2 Extenders 0BM™) which has special col- 
umns within DB2 which hold the information necessary to 
perform a free text search. Object 925 contains the specifi- 
cation of the column which is searched for the text specified 
in the query expression 910. Object 930 describes which 

15 values (i.e. which columns in which tables) the user 
requested as results for the query. Object 940 holds the 
results of executing a query. FIG. 17 describes this result 
object in more detail. Objects 950 represents optional 
objects which are system and implementation specific. 

20 A Compound Free Text Query Object 300 has multiple 
"set and get" methods 960 for the constants, variables and 
objects within itself. Methods 970 are a set of rendering 
methods which can write for instance the query condition or 
the query expression and the results to the GUI. It is up to 

25 the application what information gets explicitly exposed to 
the user (rendered) or which values are defaulted. Some of 
the rendering methods can be common to several of objects 
and be implemented in the Base Query Object leading to an 
efficient implementation of the system. 

30 Methods 965 are multiple execute methods. These meth- 
ods 965 allow for a modular and flexible system and they are 
type specific. The execute method in the Compound Free 
Text Query Object 300 knows how to assemble a structured 
query language string which when executed returns a set of 

35 ranked documents. One of the execute methods in the Base 
Query Object 220 knows how to take such a structured 
language string and "package" it correctly so that it can be 
shipped to a database. As a result, the Compound Free Text 
Query Object 300 does not need any knowledge about the 

^ communication between the application and the database. 
Conversely, the Base Query Object 220 does not need any 
knowledge about how to assemble a free text query and deal 
with multiple indexes. 

45 Methods 975 arc a set of methods to create structured 
query language strings, either in support of the execute 
method or in support of the query language as implemented 
by the GUI. 

Another important aspect of the architecture captured in 

50 this disclosure is the ability to accommodate different meth- 
ods which could enhance performance without the need to 
rewrite the whole application. Methods 980 are preferred 
embodiments of such methods. The outputs of such methods 
are strings in a structured query language which can be 

55 combined with strings created by similar methods in other 
typed query objects which then create an expression in a 
structured query language which captures the user query and 
can execute it very efficiently using Common Table Expres- 
sions. Such expressions allow for temporary tables to be 

60 constructed in a database which can be used for storing 
intermediate results. Such tables eliminate the need to load 
(potentially big) intermediate results making processing 
more efficient 

FIG. 10 is a block diagram of a Compound Boolean Text 

65 Query Object 310. A Compound Boolean Text Query Object 
is quite similar to a Compound Free Text Object 300. 
However, a Boolean text query returns a set of documents 
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which satisfy a specified constraint. A document can either FIG. U is a block diagram of a Compound Parametric 

satisfy a certain constraint or not, hence ranking result Query Object 320. A parametric query is the most basic 

documents does not apply to this type of queries. On the query against a database. In general it has the form to find 

other band, for a free text query, basically, only a set of all documents where specified constraints hold. A basic 

words can be specified. In Boolean text search, the textual 5 constraint is of the form <attributexoperatoi><value> as 

query condition can be quite complex: operators between the described in the Parametric Attribute Query Object 240 

words ranging from simple and/or to 'in the same sentence (FIG. 5). General constraints are Boolean expressions of 

as', parenthesis, synonyms etc., the complexity defined by parametric attributes. 

the underlying text search engine. Object 1110 holds the parametric query expression which 

Object 1010 holds the query expression, a Boolean constitutes the query. For example a parametric query 

expression of one or more text atom query objects. Object expression would be: find the television programs whose 

1025 holds a Compound Query Expresssion representing the SUBJECT is IMPEACHMENT AND whose NARRATOR 

same Boolean Expression as Object 1010. In FIG. 3 in the is CRONKITE. The query expression is a Boolean expres- 

description of Objects 321 and 341, these two Objects are sion of Parametric Attribute Query Objects 240. Object 1115 

explained in more detail. Object 1015 denotes indexes used holds a Compound Query Expression representing the same 

to perform a Boolean text search. One preferred embodiment 15 Boolean Expression as Object 1110. In FIG. 3 in the descrip- 

of a Boolean text search engine is DB2 Extenders (IBM™) tion of Objects 321 and 341, these two Objects are explained 

which has special columns within DB2 which hold the in more detail 

information necessary to perform a Boolean text search. Object 1120 describes which values (i.e., which columns 

Object 1020 describes which values (i.e., which columns in in which tables) the user requested as results for the query, 

which tables) the user requested as results for the query. 20 Object 1130 holds the results of executing a query. FIG. 17 

Object 1030 holds the results of executing a query. FIG. 17 describes this result object in more detail. Objects 1140 

describes this result object in more detail. Objects 1040 represents optional objects which are system and implemen- 

represents optional objects which are system and implemen- la tion specific. 

tation specific. A Compound Parametric Query Object 320 has multiple 

A Compound Boolean Text Query Object 310 has mul- 25 "set and get" methods 1150 for the constants, variables and 

tiplc "set and get" methods 1050 for the constants, variables objects within itself. Methods 1160 arc a set of rendering 

and objects within itself. Methods 1060 are a set of render- methods which can write for instance the query expression 

ing methods which can write for instance the query condi- and the results to the GUI. It is up to the application what 

tion or the query expression and the results to the GUI. It is information gets explicitly exposed to the user (rendered) or 

up to the application what information gets explicitly 30 which values are defaulted. Some of the rendering methods 

exposed to the user (rendered) or which values are defaulted. can be common to several of objects and be implemented in 

Some of the rendering methods can be common to several of the Base Query Object 220 leading to an efficient imple- 

objects and be implemented in the Base Query Object 220 mentation of the system. Furthermore, a rendering method 

leading to an efficient implementation of the system. within the Compound Parametric Query Object 320 could 

Methods 1070 are multiple execute methods. These meth- 35 use a rendering method in a Parametric Attribute Query 

ods allow for a modular and flexible system and they are Object 240. For example, the rendering method within a 

type specific. The execute method in the Compound Boolean Parametric Attribute Query Object knows bow to draw a 

Free Text Query Object knows how to assemble a structured <attributexopcrator><value> triplet, whereas the rendering 

query language string which when executed returns a set of method in the Compound Parametric Query Object 320 

documents satisfying the specified constraint. One of the 40 knows how to draw combinations and relations between 

execute methods in the Base Query Object knows how to such triplets. Again, each object knows how to render one 

take such a structured language string and "package" it thing without the need of any knowledge about the rendering 

correctly so that it can be shipped to a database. As a result, of another object. 

the Compound Boolean Text Query Object does not need Methods 1170 are multiple execute methods. These meth- 

any knowledge about the communication between the appli- 45 ods allow for a modular and flexible system and they are 

cation and the database. Conversely, the Base Query Object type specific. An execute method in a Compound Parametric 

does not need any knowledge about how to assemble a Query Object 320 knows how to assemble a structured query 

Boolean text query and deal with multiple indexes. language string which when executed returns a set of 

Methods 1080 are a set of methods to create structured documents satisfying the specified constraint. One of the 

query language strings, either in support of the execute 50 execute methods in the Base Query Object 220 knows how 

method or in support of the query language as implemented to take such a structured language string and "package" it 

by the GUI. correctly so that it can be shipped to a database. As a result, 

Another important aspect of the architecture captured in the Compound Parametric Query Object does not need any 

this disclosure is the ability to accommodate different meth- knowledge about the communication between the applica- 

ods which could enhance performance without the need to 55 tion and the database. Conversely, the Base Query Object 

rewrite the whole application. Methods 1090 are preferred 220 does not need any knowledge about how to assemble a 

embodiments of such methods. The outputs of such methods Boolean text query and deal with multiple indexes, 

are strings in a structured query language which can be The query expression 1110 is a Boolean expression and 

combined with strings created by similar methods in other the linearize method 370 in a Base Query Object 220 can be 

typed query objects which then create an expression in a 60 used to transform it into a postfix notation or any other 

structured query language which captures the user query and notation which is suitable for evaluation. This is another 

can execute it very efficiently using Common Table Expres- example of the modular and streamlined architecture sug- 

sions. Such expressions allow for temporary tables to be gested here. 

constructed in a database which can be used for storing Methods 1180 are a set of methods to create structured 

intermediate results. Such tables eliminate the need to load 65 query language strings, either in support of the execute 

(potentially big) intermediate results making processing method or in support of the query language as implemented 

more efficient by the GUI. 
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Another important aspect of the architecture captured in information gets explicitly exposed to the user (rendered) or 

this disclosure is the ability to accommodate different meth- which values are defaulted. Some of the rendering methods 

ods which could enhance performance without the need to can be common to several of objects and be implemented in 

rewrite the whole application. Methods 1190 are preferred the Base Query Object 220 leading to an efficient imple- 

embodimcnts of such methods. The outputs of such methods 5 mentation of the system. Furthermore, a rendering method 

are strings in a structured query language which can be within the Compound Feature Query Object 330 could use 

combined with strings created by similar methods in other a rendering method in a Feature Atom Query Object 250. 

typed query objects which then create an expression in a For example, the rendering method within a Feature Atom 

structured query language which captures the user query and Query Object 250 knows how to draw a 

can execute it very efficiently using Common Table Expres- to <feature><operator><value> triplet, whereas the rendering 

sions. Such expressions allow for temporary tables to be method in the Compound Feature Query Object 330 knows 

constructed in a database which can be used for storing how to draw combinations and relations between such 

intermediate results. Such tables eliminate the need to load triplets. Again, each object knows how to render one thing 

(potentially big) intermediate results making processing without the need of any knowledge about the rendering of 

more efficient. 15 another object. 

FIG. 12 is a block diagram of a Compound Feature Query Methods 1270 arc multiple execute methods. These meth- 

Object 330. In many respects a Compound Feature Query ods allow for a modular and flexible system and they are 

Object 330 is very similar to a Compound Parametric Query type specific. The execute method in the Compound Feature 

Object 320, however the differences are quite important. The Query Object 330 knows how to assemble a structured query 

query expression 1210 in a Compound Feature Query Object 20 language string which when executed returns a set of 

330 is a Boolean expression of Feature Atom Query Objects documents satisfying the specified constraint. One of the 

250 whose types are computed which implies the columns in execute methods in the Base Query Object 220 knows how 

the database to be searched. In contrast, the Parametric to take such a structured language string and "package" it 

Attribute Query Object 240 gives a precise specification of correctly so that it can be shipped to a database. As a result, 

the database search. Furthermore, in a Compound Paramet- 25 the Compound Feature Query Object 330 does not need any 

ric Query Object 320, the results are rows (or parts of rows) knowledge about the communication between the applica- 

from the database are returned, which satisfy the query tion and the database. Conversely, the Base Query Object 

expression. In a Compound Feature Query 330 the results 220 does not need any knowledge about how to compute 

are rows (or part of rows) and some additional computed intersections and unions or other combinations of values 

results. The following example should clarify this statement. 30 which are returned from the database. 

Assume that there is database table Fl which has the The query expression 1210 is a Boolean expression and 

following columns: FEATURE, START, STOP, the linearize method 370 in a Base Query Object 220 can be 

INTVALUE, STRINGVALUE. We will focus on two entries used to transform it into a postfix notation or any other 

in the feature column: MOTION MAGNITUDE and notation which is suitable for evaluation. This is another 

ZOOM. Motion Magnitude is specified using an integer, and 3S example of the modular and streamlined architecture sug- 

hence in that row an integer value will be placed in the gested here. 

intvalue column. ZOOM is specified with the text string IN . ^^^^cxecute-memods-in-a-Gompound-Feature^Query_ 

or OUT and hence in the rows where FEATURE is ZOOM, \ c Objcct330 heed to be able to evaluate a Bo olean ex pression^— 

the appropriate values will be placed in the stringvalue There, are" diff ere ntlways~of Idoihg -so.- In- one -preferred 

column. START and STOP refer to the beginning and ending 40 em bodiment leach~Felture£A^^^^ 

of a time interval and they are recorded as appropriate. A itsje^tsTetainedTMore specific the followingsteps can be 

user query of the form: find all videos which have MOTION p^formTdT^ 

MAGN1TUDE=10 and ZOOM»IN and return the time ^1) Execute the query as specified by each of the Feature 

interval when both conditions are true require a computation Atom Query Objects 250 in the Boolean expression 1210. 

of the time interval. Basically the intersection of all time 45 2) Obtain the query Result Objects 625 for each of the 

intervals where MOTION MAGN1TUDE-10 has to be Feature Atom Query Objects 250 in the Boolean expression, 

taken with all intervals where ZOOM=IN. The results will 3) Substitute the query Result Objects 625 of step 2 in the 

return START and STOP columns, however the values in Boolean expression for the Feature Atom Query Objects 

these columns are computed and not just retrieved from the 250. This results in a Boolean expression of Result Objects 

database. 50 where the Boolean expression is identical to the one speci- 

Object 1210 holds a query expression which is a Boolean fied in the Compound Feature Query Object 330. 

expression of Feature Atom Query Objects. Object 1215 4) Combine the results according to the rules of the 

holds a Compound Query Expression representing the same Boolean expression and the operators involved. 

Boolean Expression as Object 1210. In FIG. 3 in the FIG. 16 shows a flowchart of a typical execute method in 

description of Objects 321 and 341, these two Objects are 55 a Compound Feature Query Object 330. 

explained in more detail. Methods 1280 arc a set of methods to create structured 

Object 1220 describes which values (i.e., which columns query language strings, cither in support of the execute 

in which tables) the user requested as results for the query. method or in support of the query language as implemented 

Object 1230 holds the results of executing a query. FIG. 17 by the GUI. 

describes this result object in more detail. Objects 1240 60 Another important aspect of the architecture captured in 

represents optional objects which are system and implemen- this disclosure is the ability to accommodate different meth- 

tation specific. ods which could enhance performance without the need to 

A Compound Feature Query Object 330 has multiple set rewrite the whole application. Methods 1290 are preferred 

and get methods 1250 for the constants, variables and embodiments of such methods. The outputs of such methods 

objects within itself. Methods 1260 are a set of rendering 65 are strings in a structured query language which can be 

methods which can write for instance the query expression combined with strings created by similar methods in other 

and the results to the GUI. It is up to the application what typed query objects which then create an expression in a 
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structured query language which captures the user query and 
can execute it very efficiently using Common Table Expres- 
sions. Such expressions allow for temporary tables to be 
constructed in a database which can be used for storing 
intermediate results. Such tables eliminate the need to load 
(potentially big) intermediate results making processing 
_mqre efficient. 

i above algorithm is just one of many possible. Another 
embodiment for evaluating the Boolean expression is 
described in part in methods 1290. FIG. 20 describes this 
algorithm. Methods 1290 show a different way of evaluating 
^the Boolean expression to create a string in a structured 
"^aery language which is more efficient to evaluate. In one 
preferred embodiment, methods 1290 are common table 
expression methods which create instances of Common 
Table Expression Objects (described later in this disclosure) 
which in turn contain strings which describe the user feature 
I — atom query. Such expressions are used to construct a very 
\efficient (in terms of execution time) query string. 

The following is some pseudo code for a typical Typed 
Compound Query Object 340. Clearly, each type has dif- 
ferent code and may contain some different methods and 
objects. The pseudo code shown resembles the most a 
Compound FreeText Query Object 300. 
public class TypedCompoundQueryObject extends Base- 
QueryObject { 

private LinkedList resultcolumns-new LinkedList( ); 

private LinkedList columnhandles; 

private LinkedList qucryString; 

private String freetext; 

private ResultObject results»null; 

public TypedCompoundQueryObject(LinkedList rc, 
LinkedList ch, LinkedList q) { 
rc.reset( ); 
ColumnDefinition c; 
while (rc.hasMoreElements( )) { 

c=(DB2 ColumnDefinition)rc.nextElement( ); 

resultcolumns.append(c); 

\ 

columnhandles-ch; 
queryString-q; 
freetext-ta.asSql( ); 

} 

public int whoAmI( ) { 
return 9; 

} 

public ResultObject getResults( ) { 
return results; 

} 

public ResultObject executeQuery(ColumnDefinitkm 
sort, Boolean dir) { 

String columns=super.determineSelectedColumns 

(resultcolumns, 0); 
String tables-super.determineTab!es(resultcolumns); 
String handles-super.createCommaString 

(columnhandles); 
String sqlString~"WITH TEMPTABLE("+columns+", 

RANK) AS"; 

sqlString=sqlString+"(SELECT"+columns+'\ 
DB2TX.RANK("+handles+", '"+freetext+"') 
FROM"+tabIes; 



sqlString=sqlString+determineWhere(queryString)+") 

sqlString»sqlString+"SELECT * FROM TEMPT- 
ABLE WHERE RANK>0 ORDER BY RANK 
5 DESC"; 

ColumnDefinition c-new ColumnDefinition("RANK", 

"DOUBLE", "NULL"); 
resultcolumns.append(c); 
super.setrcolumns(resultcolumns); 
10 results=super.executeStatement(sqlString, sort, dir); 
return results; 

} 

} 

15 

FIG. 13 is a block diagram of a Boolean Compound 
Query Object 350. The query expression in a Boolean 
Compound Query Object 350 represents the complete user 
query composed of Typed Compound Query Objects 340 
20 and Annotator Objects 290. Typed Compound Query 
Objects 340 in turn have query expressions composed of 
Typed Elementary Query Objects 260 and Annotator 
Objects 290. 

Object 1310 holds the Boolean Query Expression which 
25 constitutes the query. Object 1315 holds a second represen- 
tation of the query called the Compound Query Expression. 
Both query expressions (1310 and 1315) express the same 
Boolean Expression using different representations and were 
shown in FIG. 3 as Objects 321 and 341. The Boolean Query 
30 Expression consists of instances of Typed Compound Query 
Objects 340 and Annotator Query Objects 290. For example, 
a Boolean expression could be of the form "(Text Query 
AND Feature Query) OR Parametric Query." The Com- 
pound Query Expression 1315 is described in detail in FIGS. 
35 15 and 19. 

Object 1320 holds the results of executing a query. FIG. 
17 describes this result object in more detail. Objects 1330 
represents optional objects which are system and implemen- 
tation specific. 

40 A Boolean Compound Query Object 350 may contain a 
special set of resultcolumns 1325. These results columns are 
used to express the following user query which is explained 
in two steps: 

1) Determine a set of values for which a set of query 
45 condition holds. 

2) To determine the final result, add the values in the 
specified result columns for which the user query evaluated 
to true. 

A Boolean Compound Query Object 350 has multiple set 

50 and get methods 1340 for the constants, variables and 
objects within itself. Methods 1350 are a set of rendering 
methods which can write for instance the query expression 
and the results to the GUI. It is up to the application what 
information gets explicitly exposed to the user (rendered) or 

55 which values are defaulted. Some of the rendering methods 
can be common to several of objects and be implemented in 
the Base Query Object 220 leading to an efficient imple- 
mentation of the system. Furthermore, a rendering method 
within the Boolean Compound Query Object 350 could use 

60 rendering methods in Typed Compound Query Objects 340. 
Methods 1360 are multiple execute methods. These meth- 
ods allow for a modular and flexible system. However this 
method is not really type specific: it knows how to traverse 
the query expression, invoke the execute methods in the 

65 Typed Compound Query Objects and assemble the results. 
One of the execute methods in the Base Query Object 220 
knows how to take such a structured language string and 
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"package" it correctly so that it can be shipped to a database. 
As a result, the Boolean Compound Query Object 350 does 
not need any knowledge about the communication between 
the application and the database. Conversely, the Base 
Query Object 220 does not need any knowledge about how 
to assemble a Boolean text query and deal with multiple 
indexes. This method is described in more detail in FIG. 16. 
A consequence of this type of a execute method is, that 
execute methods in the Typed Compound Query Objects 
could be chanced without the need of changing the execute 
method in the Boolean Compound Query Object 350. 

The query expression 1310 is a Boolean expression and 
the linearize method 370 in a Base Query Object 320 can be 
used to transform it into a postfix notation or any other 
notation which is suitable for evaluation. This is another 
example of the modular and streamlined architecture sug- 
gested here. 

Methods 1370 are a set of methods to create structured 
query language strings, either in support of the execute 
method or in support of the query language as implemented 
by the GUI. 

FIG. 14 is a flowchart of a typical execute method which 
is a part of Typed Compound Query Object 340. Clearly, the 
specific execute methods are different in each type of Typed 
Compound Query Object 340. It is one of the key features 
of this architecture, that each Query Object can create 
appropriate transformations from itself to either a form 
suitable for the GUI or a form suitable for communication 
with the database. Hence, the specifics of the execute 
method(s) in each of the Typed Compound Query Object 
340 are different and there could be multiple ones, as 
different implementations are possible. However there is a 
certain commonality to these methods which is described in 
this FIG. 14. 

Basically, the method takes two or three inputs as appro- 
priate: the resultcolumns 1410, the query expression 1415 
and the query condition 1420. Each of these inputs gets 
transformed in an appropriate structured query language 
string 1425, 1430, 1435, 1440. These strings are concat- 
enated (with the structured language specific punctuation) to 
form a structured language string which describes the query 
1445 Q. The last step is the "packaging" of Q, as different 
communications protocols could be used between the appli- 
cation and the database. In step 1450 the communications 
and database dependent elements are added to Q and then 
submitted to the database in 1455. In one preferred 
embodiment, the last three steps 1444, 1450 and 1450 are 
implemented as methods in a Base Query Object 220. 
Hence, in case the communications protocol needs to be 
changed, only the method(s) in the Base Query Object 220 
have to be changed, leaving the rest of the implementation 
of an application unchanged. 

The following is some pseudo code, specific implemen- 
tation may vary, 
public void execute( ) { 
QueryList.reset( ); 

linked List component_results=new Linked Lis t( ); 
if (QueryList.size( )==3) { 
Query o=(Query)QueryList.nextElement( ); 
/* its a parenthesis */ 
if (o.whoAmI( ) <6) { 
o-(Query)QueryList.nextElement( ); 
combinedResults=o.executeQuery(sort, dir); 
} 

clse{ 
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System.oul.printlnC'CompoundQuery is ill-formed'*); 
else { 

5 while (QueryLisLhasMoreElements( )) { 
Query o-(Query)QueryList.nextElement( ); 
/* its an operator or a parenthesis */ 
if (o.whoAmI( ) <6) { 
component results.append(o); 

to } 

/* its a query */ 
else{ 

ResultObject r-o.executeQuery(sort, dir); 
if (r— null) 

15 I* creates a ResultObject which identifies itself as 

being null */ 
r-new ResultObject(2); 
component_results. append(r); 

20 y ^ 

I* component_results is a LinkedList of Annotator 

Objects and ResultObjects V 
/* mirroring Query List with the Query Objects replaced 
25 by ResultObjects */ 

/* convert the component_re suits vector from an inFix 

representation into a */ 
/* postFix representation */ 

LinkedList pf__results«Combine. asPostFix(component_ 
30 results); 

/* combine the results into a single ResultObject */ 
combinedResults=Combine.combine(pf_results); 



In one preferred embodiment, the result columns 1410 
contain several pieces of information: for each column, the 
name, its type and the table it belongs to are known. Hence 
40 they can be used to deduct both the column and table 
information for the query. 

/FIG. 15 is a flowchart of a linearize method 371 in FIG. 
3 for the Base Query Object 220 applicable to all Derived 
Query Objects 360, or "Query Objects" for short. The 

45 method 371 transforms Compound Query Expressions (341, 
1025, 1115, 1215, and 1315), for Typed Compound Query 
Objects and Boolean Compound Query Objects into Query 
Expressions (321, 910, 1010, 1110. 1210, 1310) for these 
Derived Query Objects 360. Compound Query Expressions 

so (341, 1025, 1115, 1215, and 1315), are useful for the process 
in FIG. 19 for generating a Boolean Compound Query 
Object 350 from end user input to a Graphical User Interface 
(GUI) 127, and Query Expressions (321, 910, 1010, 1110, 
1210, 1310), which are useful for the method in FIG. 20 for 

55 executing Derived Query Objects 360, as described in the 
method of FIG. 20. 

The Compound Query Expressions 1315 (derived from a 
Compound Query Expression 341) for the Boolean Com- 
pound Query Object 350 in FIG. 13 is itself an instance of 

60 a Boolean Compound Query Object 350, and this instance in 
turn contains a Compound Query Expression 1315, com- 
prising in turn of some Boolean combination of Derived 
Query Objects 360. We can describe this circumstance by 
saying that the Compound Query Expressions (341, 1025, 

65 1115, 1215, and 1315), exemplify "recursion" or "recursive 
nesting" in that Compound Query Expressions (341, 1025, 
1115, 1215, and 1315) contain Derived Query Objects 360 
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each of which in turn contain Compound Query Expressions We can now define the linearize method 371 with flow- 

341, 1025, 1115, 1215, and 1315) which contain Derived chart in FIG. 15 as a method that applies to all Derived 

Query Objects 360 each of which contains a Compound Query Objects 360, including Operator Objects 270, begin- 

Query Expression 341, 1025, 1115, 1215, and 1315) and so ning with the "root" Boolean Compound Query Object 350 

on to any level of recursive nesting. Recursion is well known 5 or Q. The method does two things: 

algorithmic structure in general, however this application of (1) The method 371 transforms the Compound Query 

recursipn.is ncw_ Expression 1315 of this "root" Boolean Compound Query 

FIGS. ISA, 15B arxf 15C sflbw three representations of Object 350 or Q into a Query Expression 1310. The Query 
the same Boolean'query expression, exemplifying the con- Expression 1310 for Q will comprise of Typed Compound 
cepts "recursive nesting," "hierarchical tree", "parent" and to Query Objects 340, Operator Objects 350 connecting pairs 
"child Query Objects. FIG. ISA shows an example of a of Typed Compound Query Objects 340, and Parenthesis 
Boolean query expression using a syntax of Boolean opera- Objects 280 that may group one or more Typed Compound 
tors to connect sub-queries and parentheses to group sub- Query Objects 340. In effect, the method 371 turns the 
queries queries. An end user could type such a query "hierarchical tree" exemplified in FIG. 15C into a linear 
expression into a GUI query element such as a Text Entry 15 expression exemplified in FIG. 15A. 
Input Field, and well known parsing algorithms (not covered (2) The method 371 is applied to the Typed Compound 
in this disclosure) could parse and interpret the text charac- Query Objects 340 in the Compound Query Expression 
ters as query elements. FIG. 15B shows the same query (341, 1025, 1115, 1215, and 1315) for Q and transforms the 
expression with labels such as "1", "2", "2.1", "2.2" etc. Compound Query Expression (341, 1025, 1115, 1215, and 
FIG. 15C shows this same query expression as a hierarchical 20 1315) in each of these "child" Typed Compound Query 
tree of Derived Query Objects 360, using the same labels as Objects 340 into a Query Expression 1315 that in turn can 
in FIG. 15B. Typed Compound Query Objects 340 contain comprise of combinations of "child" Typed Compound 
"child" Derived Query Objects 360: e.g., the Compound Query Objects 340, and/or Typed Elementary Query Objects 
Boolean Text Query Object 310 labeled "2.1" in FIG. 15C 260, Operator Objects 270 connecting pairs of Typed Corn- 
contains the "child" Query Object labeled "2.1.1", and 25 pound Query Objects 340 and/or Typed Elementary Query 
Operator Object with identity "and", and Query Object Objects 260, and Parenthesis Objects 280 that may group 
"2.1.2". Such a parent/child hierarchy is well known in one or more Typed Compound 340 or Typed Elementary 
Object Oriented Programming, however its application here Query Objects 260, such that resulting Query Expression 
is novel. 1315 represents a well-formed and executable Boolean 

More generally, a Derived Query Object 360, or Q can be 30 Compound Query Expression 350. 

defined as the "parent" of the Derived Query Objects In contrast to the recursive Compound Query Expressions 

contained in Compound Query Expression Objects (341, (341, 1025, 1115, 1215, and 1315), the Query Expression 

1025, 1115, 1215, and 1315) of Q. Conversely, the Derived 1315 is a non-recursive structure, where the order of Derived 

Query Objects 360 contained in the Compound Query Query Objects 360 and Annotator Objects 290 is determined 

Expression (341, 1025, 1115, 1215, and 1315) of Q can be 3S by the linearize method to match the order of Derived Query 

called the "children" of the "parent" Query Object Q. In Objects 360 and Annotator Objects 290 in the Compound 

FIG. 15C, the Boolean Query Object 350 numbered "2" is Query Expressions (341, 1025, 1115, 1215, and 1315). 

the "parent" of the Compound Boolean Text Query Object Referring again to FIG. 15, any Query Object Q, Step 

310 numbered "2.1", the Operator Object 270 labeled at the 1520 determines if the Compound Query Expression (341, 

same level of the hierarchical tree, and the Compound 40 1025, 1115, 1215, and 1315) for Q is empty. The Compound 

Parametric Query Object 240 labeled "2.2". The latter three Query Expression (341) would be empty for Typed Elemen- 

Query Objects in turn are the "children" of the Query Object tary Query Objects 260 and Operator Objects 270 because 

labeled "2" in the Figure. We also define the "root" Boolean these Query Objects are elementary, and have no "child" 

Compound Query Object 350 as that Query Object that is Query Objects, only values (e.g., text terms for a Text Atom 

not contained in any other Boolean Compound Query Object 45 Query Object 230, or an identity "left" or "right" for a 

350 or any other Derived Query Object 360 within this Parenthesis Query Object 280). 

Compound Query Expression 1315. In FIG. 15C this "root" If the Compound Query Expression (341, 1025, 1115, 

Boolean Compound Query Object 350 is the topmost Query 1215, and 1315) is not empty the Steps beginning with 1530 

Object in the hierarchical tree of Query Objects, labeled with are executed. Step 1530 determines whether there are any 

the number " 1". 50 Query Objects left. Step 1544 determines whether a Derived 

The preferred implementation of Compound Query Query Object 260, or q in the Compound Query Expression 

Expressions Objects (341, 1025, 1115, 1215, and 1315) is a is Typed Elementary Query Object 260 or an Operator 

Linked List (standard program construct) of query objects, Object 270. If "yes" to either case, then Step 1546 adds q to 

where each object can be a Typed Compound Query Object the Query Expression (341, 1025, 1115, 1215, and 1315) for 

340 or Typed Elementary Query Object 260 or an Operator 55 Q. If the Query Object q is a Typed Compound Query Object 

Object 270. Each of the Typed Compound Query Objects 340, then the Steps beginning with 1552 are executed: Step 

340 in turn has a Compound Query Expression (341, 1025, 1552 sets the parent value of q 1 to the Query Object Q using 

1115, 1215, and 1315) comprising of Linked Lists of set methods 361 in FIG. 3 for the Base Query Object 220. 

Derived Query Objects 360. The Compound Query Expres- (This information will be used in Step 1562 below.) Step 

sions (341) for Typed Elementary Query Objects 260 and 60 1555 applies the linearize method 371 in FIG. 3 to each 

Operator Objects 270 are null or "empty" in common Derived Query Object 360 q 1 in the Compound Query 

programming terms, and we can also say these Query Expression (341, 1025, 1115, 1215, and 1315) which results 

Objects have no "child" Query Objects. (The Compound from applying the linearize method to the Derived Query 

Query Expressions (341) for Typed Elementary Query Object 360, or q in the Compound Query Expression (341, 

Objects 260 and Operator Objects 270 are inherited by the 65 1025, 1115, 1215, and 1315) for the "parent" Derived Query 

Base Query Object 220, in standard Object-Oriented pro- Object 360 Q. Step 1558 adds the Query Object q' to the 

gramming terms). Query Expression 1315 for Q: using the "parent" and 
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"child" terminology above we can say that the "parent" 
Query Object Q adds to its Query Expression 1315 all the 
"children" in each of it's "child" Query Objects q. 

When the steps beginning with 1552 are completed for all 
the "child" Derived Query Objects 360 q in the Compound 
Query Expression (341, 1025, 1115, 1215, and 1315) for the 
"parent" Q, the steps beginning with Step 1565 determine 
the content of the Query Expression 1315 for the "parent" 
Derived Query Object 360 Q. Step 1565 determines whether 
the "parent" Q is a Boolean Compound Query Object 350. 
If "yes" Step 1570 inserts a Parenthesis Object 280 with the 
identity 815 "left" into the Query Expression 1315, and Step 
1574 adds a Parenthesis Object 280 with the identity 815 
"right" into the Query Expression 1315, and Step 1580 exits 
the linearize method. 

If Step 1565 determines whether the "parent" Query 
Object Q is a Boolean Compound Query Object 350, and the 
Step 1562 determines whether the "parent" of the "parent" 
Query Object Q is itself a Boolean Compound Query Object 
350, using the get methods 361 in FIG. 3 for the Base Query 
Object 220. If yes, then Step 1528 sets the Query Expression 
1315 for the Query Object Q to Q itself, rather than the 
"child" Derived Query Objects 360 of Q from Steps begin- 
ning in 1530. Step 1570 inserts a Parenthesis Object 280 
with the identity "left" into the Query Expression 1315, and 
Step 1574 adds a Parenthesis Object 280 with the identity 
"right" into the Query Expression 1315, and Step 1580 exits 
the linearize method. 

The Steps 1562 and 1528 ensure that the Query Expres- 
sion 1315 for the "root" Boolean Compound Query Object 
350, defined as the Boolean Compound Query Object 350 
with no parent itself, only contains Typed Compound Query 
Objects 340, Annotator Objects 290, and no Typed Elemen- 
tary Query Objects 260. 

If Step 1562 determines that the "parent" Query Object Q 
is not itself a Boolean Compound Query Object 350, then 
the Query Expression 1315 for Q is used as derived from the 
Steps beginning 1530 through 1558, and Step 1570 inserts 
a Parenthesis Object 280 with the identity 815 "left" into the 
Query Expression 1315, and Step 1574 adds a Parenthesis 
Object 280 with the identity 815 "right" into the Query 
Expression 1315, and Step 1580 exits the linearize method 
371. 

These Steps ensure that the Compound Query Expres- 
sions (341, 1025, 1115, 1215, and 1315) for all the Typed 
Compound Query Objects 340 (where the latter objects are 
contained in the Compound Query Expressions (341, 1025, 
1115, 1215, and 1315) for the "root" Boolean Compound 
Query Object (350) contain all the "child" Derived Query 
Objects 360 developed as a result of the Steps beginning 
1530. 

When Step 1520 is applied to any Operator Query Object 
270, or to a Typed Elementary Query Object 260, the 
Compound Query Expression for that Query Object is 
empty, in which case Step 1525 adds the Query Object itself 
to its Query Expression 1315, and Step 1580 exits the 
linearize method 371. 

FIG. 16 is a flowchart of a typical execute method which 
is part of a Boolean Compound Query Object 350 or a 
Compound Feature Query Object 330. It applies to any 
Query Object where the query expression is a Boolean 
expression of query objects, each of which having its own 
execute method. 

The input to an execute method 1360 of a Boolean 
Compound Query Object 350 is a Boolean Expression 
comprising of Typed Compound Query Objects 340 and 
Annotator Objects 290. The process of arriving at such a 
Boolean Expression is shown in FIG. 19. 



In one preferred embodiment, each Typed Compound 
Query Object 340 is executed separately and its Result 
Object (FIG. 17) substituted in the Boolean Expression for 
the Typed Compound Query Object 340, resulting in a 
5 Boolean Expression of Result Objects. These Result Objects 
arc then combined according to the rules of the Boolean 
Expression. There are different ways of implementing the 
above described steps and one preferred embodiment is now 
outlined. 

10 The input 1605 to an execute method is a Boolean 
Expression of Typed Compound Query Objects 340 and 
Annotator Objects 290. The next step 1610 tests whether 
there are any components left in the expression. In the 
affirmative case the next object in the Boolean Expression is 

15 checked 1615 to determine whether it is an Annotator Object 
290. An Annotator Object is pushed on the stack S in 1620. 
A Typed Compound Query Object 340 is evaluated in 1625 
by calling its execute method and the resulting Result Object 
(FIG. 17) is pushed on the stack S. Then the Boolean 

20 Expression 1605 is tested again to see whether there are any 
objects left to be evaluated. When there are no more objects 
left, the content of the stack S — which form a Boolean 
expression in prefix format are converted to a Boolean 
Expression PF in postfix format in box 1630. This step is 

25 done with one of many well known algorithms. The result of 
this step is shown in box 1635 — PF — the Boolean expres- 
sion in postfix format. 

The execute method continues by examining each object 
in the Boolean Expression PF in turn. In case there are 

30 objects left in PF, which is checked in 1638, the next object 
is checked whether it is a result object (FIG. 17) in 1640. 
Result Objects are pushed onto a stack T 1645. In case the 
next object is an operator object, some computation is 
performed in 1650: The last two objects are popped from the 

35 stack T 1645, and the operator object is applied between 
these two just popped result object (FIG. 17). Applying the 
operator then results in a new result object which is pushed 
onto stack T. When no more objects are encountered in 1635, 
there is only one object remaining on the stack T, which is 

40 popped from the stack and returned as the result of the 
execute method in 1655. 

FIG. 17 is a block diagram of a Result Object. One key 
feature of this Query Object architecture is that all appro- 
priate Query Objects have the same type of result object. The 

45 result object is database and GUI independent and has a rich 
set of access functions to it. It also contains methods to 
transform into an application specific representation. For 
instance, it could be advantageous to change the names of 
the columns or add a column to make all the result objects 

50 be of the form which is necessary for the execute method in 
a Boolean Compound Query as described in FIG, 16. 

Object 1710 holds the result column headings. Each 
heading contains the name of the column, the table it 
belongs to and its type. All the headings can then be arranged 

55 either in an array or a linked list or another ordered structure. 
The ResultRows 1715 are a two dimensional array contain- 
ing the values of the results. There arc a multitude of 
implementations for such an array and depend on the 
language the application is written. Objects 1720 are 

60 optional and implementation specific. 

There are set and get methods 1725 for all the contained 
objects, variables and constants in the Result Object. 
Method 1730 is a transform method which is application 
specific. This method can for instance change the names in 

65 the columns, change values if appropriate (e.g., change all 
negative values to zero) or any other application specific 
transformation. Invoking method 1735 — NumbcrOfResults 
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returns the number of rows in a particular Result Object. 3) T3 (VIDEO, DATE) AS (SELECT VIDEO, DATE 

Another quite useful method is GetColumn Values 1740 FROM Tl, T2) 

which returns all the values for a specific column. The above expression creates a temporary table T3 which 

The Normalize method 1745 is invoked mainly to accom- contains data which is either in Tl or T2 (i.e., logical 

modate the execute method 1360 in a Boolean Compound 5 OR) — the common table expressions constructed in 1) and 

Query Object 350 described in FIG. 13. All the result objects 2). 

which are input to such an execute method are of the same 4) T4 (VIDEO, DATE) AS (SELECT VIDEO, DATE 

structure— they have the same number of columns which all FROM TABLE3 WHERE RELEASE<1995 UNION ALL 

have the same headings. To achieve that, one or more SELECT VIDEO, DATE FROM T2 

columns and their values may have to be added to a result 10 The above expression creates a temporary table T4 which 

object which is done in the Normalize method 1745. There contains data which is either in TABLES and satisfies the 

can be some optional methods 1750. constraint RELEASE*: 1995 or in T2 (i.e., logical AND) 

The query object architecture is a flexible and expandable The above examples are quite simple and in some cases 

one. At times, it is appropriate to introduce new query may be written more simply. However their importance will 

objects due to the type of data and/or application at hand. IS be apparent in the discussion of FIG. 20 where an execute 

Introducing new query objects allows for instance for the method of a Boolean Compound Query Object is discussed. 

Compound Query Instantiation process to stay unchanged or FIG. 19 is a flow chart of a Compound Query Instantiation 

for some execute methods to be reused. The Common Table process that transforms inputs like text strings originating as 

Expression Query Object as described in FIG. 18 is a query elements of a Graphical User Interface (GUI) into an 

objectintroducedtofacilitatequeryexpressionevaluation.lt 20 instance of a Compound Query Expression 1315 for a 

enables the algorithm for traversing a Boolean expression Boolean Compound Query Object 350. This Compound 

and taking into account precedence of operators to stay Query Expression (341, 1025, 1115, 1215, and 1315) in turn 

unchanged and at the same time create a structured language contains one instance of a "root" Boolean Compound Query 

string which is much faster to evaluate. Object 350, and this Query Object contains a Compound 

FIG. 18 is a block diagram of a Common Table Expres- 25 Query Expression 1315, which contains a collection of 

sion Query Object. This object is a good example of the Derived Query Objects 360 comprising of Typed Elemen- 

flexibility and expandability of our architecture. Two basic tary Query Objects 260, Typed Compound Query Objects 

concepts are: a Boolean expression which represents a user 340, and Operator Query Objects 270 in combinations that 

query is composed of Query Objects. In turn, each Query represent a well- formed Query Expression 1310 when the 

Objects has a method to translate the query it represents into 30 method of FIG. 15 is applied to the "root" Boolean Com- 

a database specific query language. The Common Table pound Query Object 350. 

Expression Query Object is a Query Object which contains Typed Compound Query Objects 340 contained in Com- 

a specific translation of a query or sub-query into SQL which pound Query Expression (341, 1025, 1115, 1215, and 1315) 

is optimized for performance. can be termed for convenience "child" Query Objects of the 

The Common Table Expression Query Object contains at 35 Query Object whose Compound Query Expression (341, 

least two objects: the name object 1810 and the SqlString 1025, 1115, 1215, and 1315) contains them. For 

object 1820. The name is generated using method 1860. One convenience, we can also define the Query Object Q whose 

preferred embodiment of the name generation is to use a Compound Query Expression contains these "child" Query 

fixed textstring and append a number to it. The application Objects, as the "parent" of those Query Objects contained in 

keeps track of the numbers used so far. The SqlString in 40 the Compound Query Expression (341, 1025, 1115, 1215, 

general is created during an execute method of a different and 1315) of Q. The definitions are the same as those defined 

object which uses a Common Table Expression Object to for the linearize process of FIG. 15, and FIG. 15C shows an 

store intermediate results. example of a Compound Boolean Query Object 350 with 

Methods 1840 are set/get methods for the objects, vari- "child" Query Objects displayed in a hierarchical tree, where 

ables and constants used in the Common Table Expression 45 the "child" Query Objects are in turn "parents" of lower 

Object. level "child" Query Objects. 

Objects 1830 arc optional objects which are implemen- The Compound Query Instantiation process in FIG. 19 is 

tation and application specific as are the optional methods initiated by some kind of Graphical User Interface (GUI) 

1860. event, e.g., a preferred implementation corresponds to the 

The following examples are written in SQL and show 50 end user pressing a "Search" button in the GUI. The process 

examples of strings as they may appear in Object 1820. comprises writing code that includes the following the steps 

1) Tl (VIDEO, DATE) AS (SELECT VIDEO, DATE (1) through (4), embodied in program code: 

FROM TABLE 1 WHERE TOPIC- HAPPINESS' AND (1) Program code is written in such a way that each GUI 

LOCATION='NEW YORK') query element and program code unit can be associated with 

The above expression augmented with the keyword 55 one Derived Query Object 360 in the Boolean Compound 

WITH creates a temporary table Tl which has two columns Query Object 350 expressed by the GUI query elements 

VIDEO and DATE. These two columns have values taken taken as a whole. 

from TABLE1 which contains the columns TOPIC and (2) Program code is written in such a way that each GUI 

LOCATION and satisfy the two constraints TOPIC- query element can be evaluated as a "valid" query element 

'HAPPINESS' and LOCATION-' NEW YORK' 60 that expresses an end user's intention to use that GUI query 

2) T2 (VIDEO, DATE) AS (SELECT VIDEO, DATE element to represent a particular type of Query Object. 
FROM TABLE2 WHERE ANCHOR-'DAN RATHER') (3) Program code is written in such a way that it produces 

The above expression augmented with the keyword valid Compound Query Expressions (341, 1025, 1115, 1215, 

WITH creates a temporary table T2 which has two columns and 1315), adding "child" Query Objects to "parent" Typed 

VIDEO and DATE. These two columns have values taken 65 Compound Query Objects 340 in a way that expresses a 

from TABLE2 which contains the columns ANCHOR and well-formed Boolean query expression, including Operator 

which satisfy the constraint ANCHOR-'DAN RATHER'. Query Objects 270 connecting pairs of Query Objects. 
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(4) Program code is written in such a way that the GUI intention to include the Query Object associated with GUI 
query elements and program code units which use and query element an a component of an overall Boolean Corn- 
operate on these GUI query elements, resulting from these pound Query Object 350. A GUI query element may also be 
process steps (1), (2), (3) and (4) taken in their totality, and valid by virtue of program logic that relates two or more 
executed as a program (with end user input to the GUI query 5 GUI query elements. For example, a user may specify the 
elements) will instantiate a Boolean Compound Query value of an GUI query element representing an Operator 
Object 350 that comprises of Boolean combinations of Query Object 270, but this action only makes the query 
Derived Query Objects 360 including Typed Elementary element a valid representation for the Operator Query Object 
Query Objects 260, Typed Compound Query Objects 340, 270 if the user also specifies (makes "valid") values for GUI 
and Operator Query Objects 270 connecting pairs of these 10 query elements that are associated with Query Objects 
Query Objects. The process of instantiation is described by which would be connected by the Operator Object 270 in 
the flowchart in FIG. 19. question. Standard programming practices can readily 

In assumption (1) a GUI query element and/or program implement such program relations among GUI query de- 
code unit can be associated specifically with an Typed ments. 

Elementary Query Object 260 (e.g., a Text Atom Query 15 In assumption (3), program logic is written such that 

Object 230, an Parametric Attribute Query Object 240, or a "valid" GUI query elements arc used to set values of 

Feature Atom Query Object 250), an Operator Query Object Derived Query Objects 360 in FIG. 2, using set methods 361 

270, or a Typed Compound Query Object 340 (e.g., a in the Base Query Object 220, and program logic uses add 

Compound Boolean Text Query Object 310, a Compound methods 351 in the Base Query Object 220, to create 

Parametric Query Object 220, or a Compound Feature 20 combinations of Derived Query Objects 360 connected by 

Query Object 330). GUI query elements and program code Operator Objects, and contained in the Compound Query 

units can relate4o-QueryJDbjects of any type. Expressions (341, 1025, 1115, 1215, and 1315) of Typed 

FIGS.-19A-and-19B^show examples of GUI query Compound Query Objects 340, which in turn may be 

elements," with user input, and the descriptors with arrows connected by Operator Query Objects 270, and which make 

pointing to the GUI query elements label these elements in 25 up the content of a "root" Boolean Compound Query Object 

conjunction with pseudocode described below. FIG. 19^ 350. 

shows a form with GUI query elements representing text- The GUI query elements can be defined and/or how 
query criteria, parametric criteria, an d fe ature criteria (see program code can be written in any general way to satisfy 
labels in figure). Select GUI query elements are also labeled assumptions (1), (2), (3), or (4). However, the GUI allows 
with program names- used- in- the~pseudo-code below: e.g.j 30 end users to specify a valid Boolean Compound Query 
the "textAtomlnputl" GUI query element is a standard GUI Object 350 composed of Derived Query Objects 360 con- 
text entry field into which a user has typed "Bill Clinton" as nected by Operator Query Objects 270 and must satisfy the 
a text query term. These text terms "Bill" and "Clinton" are five assumptions in the prior paragraph. The programming 
extracted by program code, and used to set the value.ofja process fulfills the assumptions (1) through (4), and results 

single Text Atom Query Object 230 ("Bill Clinton"): ' 35 in the generation of a "root" Boolean Compound Query 

There may be multiple of these GUI query elements Object 350, and all the Derived Query Objects 360 con- 
expressing Text Atom Query Objects 230 of a Boolean Text tained in its Compound Query Expression (341, 1025, 1U5, 
Query Object 31, as FIG. 19Ashows. Additional GUI query 1215, and 1315). 

elements allow end users to specify the value of Operator Once GUI query elements and program code are written 

Query Objects 270 expressing Boolean operators connecting 40 to satisfy these assumptions (lM 5 )* me process in FIG. 19 

these Text Atom Query Objects 230. Hence GUT query will "instantiate" a Boolean Compound Query Object 350, 

elements can express groupings that correspond to various and all the Derived Query Objects 360 that are contained in 

types of Typed Compound Query Objects 340. its Compound Query Expression (341, 1025, 1115, 1215, 

Alternatively, FIG. 19B shows a single GUI text entry and 1315), as intended by the end user. m 

field may contains a complex Boolean query expression 45 In toe preferred implementation, standard programming 

multiple component query criteria connected by Boolean methods are used to associate GUI query elements and 

operators, in which case this single GUI query element program code units with instances of Derived Query Objects 

actually resolves into a Boolean Compound Query Object 360. The end user expresses his or her intentions by inter- 

350 of any complexity as implied in the end user's typed acting with the GUI query elements to specify values of the 

query expression. FIG. 19B in particular shows a GUI query 50 Query Objects associated with the GUI query elements, 

element into which an end user types a Boolean query Instantiating for Typed Elementary Query Objects 260 

expression which is parsed by program code (this parsing means GUI query element values are used to set Query 

process is not covered by this disclosure) to create the Object values using set methods 361 in the Base Query 

Boolean Compound Query Object 350 that was also shown Object 220. Instantiation for Typed Compound Query 

in FIG. 15A. ss Objects 340 means that "child" Derived Query Objects 360 

Finally, there may not be a GUI query element for a Query are added to the Compound Query Expressions (341, 1025, 

Object, but a program code unit that creates a Query Object 1115, 1215, and 1315) for these Typed Compound Query 

by programmer stipulation: e.g., there may be no explicit Objects 360, using add methods 351 in the Base Query 

GUI query element for specifying a Boolean operator, but Object 220. The result Boolean Compound Query Object 

there is a program code unit that creates an Operator Query 60 350 provides the Compound Query Expression (341, 1025, 

Object 270 with a stipulated value (also called a "default" 1115, 1215, and 1315) needed for the other methods and 

value in programming terms). These methods are illustrated processes covered by this disclosure. The following pseudo- 

with pseudo-code below. code illustrates this process of instantiation and adding of 

In assumption (2), a "valid" GUI query element is simply Derived Query Objects 360 to Typed Compound Query 

one that an end user has acted on and in so doing has 6S Objects 340. The pseudo-code immediately below provides 

specified a value (e.g., typed in text terms representing an example of how GUI query elements are related to 

keywords for a text query), and thereby expressed the instantiation of Derived Query Objects 360: 
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1) textAtomlnputl-new TextEntryField( ); 

2) textOperatorlonew TextEntryFicld( ); 

3) textAtomInput2=new TextEntryFicld( ); 

4) textOperator2-new TextEntryField( ); 

5) textAtomInput3=new TextEntryField( ); 
6) 

7) textOperator3=new TextEntryFieId( ); 
8) 

9) textAtomInput4-new TextEntryField( ); 

10) textOperator4«new TextEntiyField( ); 

11) tcxtAtomInput5=new TextEotryFicld( ); 

12) textOperator5-aew TextEntryField( ); 

13) textAtomlnputfi-new TexlEntryField( ); 
14) 

15) compoundOperatorloncw TcxtEntryFicld( ) 
16) 

17) attributeAtomNamelnpull-new TextEnlryField( ); 

18) attributeOperatorl=new TextEntryField( ); 

19) attributeAtomValueInputl=new TextEntryField( ); 
20) 

21) attributeOperator2-new TextEntryField( ); 
22) 

23) attributeAtomNameInput2=new TextEntryField( ); 

24) attributeOperator3=new TextEntryField( ); 

25) attributeAtomValueInput2=new TextFmryField( ); 
26) 

27) compoundOperator2=new TextEntryFicld( ); 
28) 

29) feature AtomNameInputl=new TextEntryField( ); 

30) featureOperatorl-new TextEntryField( ) 

31) featureAtomValuelnputl-new TextEntryField( ); 
32) 

33) featureOperator2-Dew TextEntryField( ) 
34) 

35) featureAtomNameInput2=new TextEntryField( ); 

36) featureOperator3=new TextEntryField( ) 

37) featureAtomValueInput2=new TextEntryFicld( ); 
38) 

39) if (tcxtAtomInpua.isValid( ) & tcxtAtomInput2.isValid( 
)){ 



58) booleanTextCompound2=new Boole anTextCompound( 

); 

59) booleanTextCompoundl .add(textAtom4); 

60) booleanTextCompound2.add(textOperator3); 
5 61) booleanTextCompound2.add(textAtom5); 

62) if (textAtomIoput6.isVahd( )) { 

63) textAtom6=new TcxtAtom(textAtomInput6.getTerms( 

)); 

64) textOperator4-new Operator(textOperator3.getIdentity( 
io )); 

65) booleanTextCompoundl .add(textOperator4); 

66) booleanTextCompoundl. add(textAtom6); 

67) } 

68) } 

69) if (booleanTextCompoundl. basContents( ) & 
booleanTextCompouod2.hasContents( )) { 

70) boolcanTcxtCompound3-new BooieanTextCompound( 

); 

71) booleanCompound-new BooleanCompound( ); 

72) booleanCompound.add(booleanTextCompound3); 

73) booleanCompound.add(new Operator("AND")); 

74) booleanCompound.add(parametricCompoundl); 

75) } 
76) 

25 77) etc. for Parametric attributes, Features, etc. 

In the pseudo-code example, a set of GUI query elements 
are defined in lines 1-38 and given names identifying the 
type of Query Object created from the query element This 
is standard programming practice. These lines of code can 
30 also be viewed in relation to the GUI example in FIG. 19A: 
e.g., the GUI query element labeled "textAtomlnputl" in 
line 1 of the pseudo-code above is shown in FIG. 19A. Lines 
1, 3 and 5 of the pseudo-code corresponds to the GUI query 
input elements in FIG. 19 A containing the name "Bill 
35 Clinton", "visit" and "China". Note the not all the GUI query 
elements listed in the pscudo-codc arc labeled in FIG. 19A, 
but just a few to suggest the correspondence. The lines 
39-77 define blocks of code that test if the GUI query 
elements related to the Text Atom Query Objects 230 are 



15 



20 



40) textAtoml-new TextAtom(textAtomInputl.getTerms( 40 "valid" and creates Derived Query Objects 360, including 



)); 

41) textAtom2-new TextAtom(textAtomInput2.getTerms( 

)); 

42) textOperatorl=new Operator(textOperatorl.getIdentity( 

)); « 

43) booleanTextCompoundloncw BoolcanTextCompound( 

); 

44) booleanTextCompoundl.add(tcxtAtoml); 

45) booleanTextCompoundl. add(textOperatorl); 

46) booleanTextCompoundl .add(textAtom2); 50 

47) if (textAtomInput3.isValid( )) { 

48) textAtom3=new TextAtom(textAtomInput3.getTerms( 

)); 

49) textOperalor3=new Operator(textOperator3.getIdentity( 

)); 

50) booleanTextCompoundl. add(textOperator2); 

51) booleanTextCompoundl.add(textAtom3); 

52) } 

53) } 

54) if (textAtomInput4.isValid( ) & textAtomInput5.isVa!id( 
)){ 

55) textAtom4=new TextAtom(textAtomInputl.getTerms( 



)); 

56) textAtom5=new TextAtom(textAtomInput2.getTerms( 

)); 

57) textOperator3=ncw Operator(textOperatorl.getIdentity( 

)); 



Compound Boolean Text Query Objects 310. "Valid" means 
the user has typed terms in the input fields, e.g., "Bill 
Clinton", vs leaving the text entry field blank. Lines 39-53 
for example test if three text entry fields named "textAtom- 
lnputl" and "textAtomInput2" and textAtomInput3 are 
"valid." In FIG. 19A, this means the end user has entered 
text terms "Bill Clinton", "visit" and "China". The end user 
may or may not have specified the "&" operators connected 
these terms. If the two input fields are "valid" than the block 
of code is executed: the code in lines 40-42 create two Text 
Atom Query Objects 230, and an Operator Object 270, and 
the lines 43-46 create a Compound Boolean Text Query 
Object 310, and adds the Text Atom Query Objects 230 and 
Operator Object 270 to the this Query Object, using the 
55 "add" method 351 in the Base Query Object 220. The "add" 
method adds the "child" Query Objects to the Compound 
Query Expression 1025 of the Compound Boolean Text 
Query Object. 

The lines of code 69-75 create another Compound Bool- 
ean Text Query Object 310. If the Compound Boolean Text 
Query Objects resulting from lines of code 39-53 and 54-68 
have contents, the lines 69-75 create yet a third Compound 
Boolean Text Query Object 310, and adds the Compound 
Boolean Text Query Objects 310 from lines of code 39-53 
and 54-68 as "child" Query Objects, connected by an 
Operator Object. Additional lines of code of similar form 
could be written for Parametric Attribute Query Objects 240 



60 



06/04/2004, EAST Version: 1.4.1 



US 6,3 

33 

and Feature Atom Query Objects 250, and their correspond- 
ing Compound Parametric Query Objects 320 and Com- 
pound Feature Query Objects 230. 

The pseudo-code illustrates all the steps (l>-( 4 ) above: 
how GUI query elements are used to instantiate Derived 
Query Objects 360, how Operator Objects 270 can become 
"valid" by virtue of the "validity" of other Query Objects, 
and can have "default" values stipulated by program code, 
and not necessarily end user input, and how one Derived 
Query Object 270 can be added to a Typed Compound 
Query Object 340. 

We define a special case of the Boolean Compound Query 
Object 350 resulting from program execution, called a 
"GUI-Complete" Boolean Compound Query Object 350: 
The "GUI-Complete" Boolean Compound Query Object 
350 is the Query Object that would result if an end user 
interacted with the GUI to make every GUI query element 
"valid". It is therefore the most complete theoretically 
possible Boolean Compound Query Object 350 that could 
result from executing the program code according to the 
steps (1), (2) and (3). 

We can now preview the process in FIG. 19 as one that 
"instantiates" an actual Boolean Compound Query 350, 
starting from this ideal "GUI-Complete" Boolean Com- 
pound Query Object 350. The process instantiates a set of 
Derived Query Objects 360 that correspond to 'Valid" GUI 
query elements, and/or program code that combines Query 
Objects into Typed Compound Query Objects 340. All these 
Query Objects are ultimately contained in a Compound 
Query Expression (341, 1025, 1115, 1215, and 1315) of a 
"parent" Boolean Compound Query Object 350. Another 
way to describe this is to say that the "GUI-complete" 
Boolean Compound Query Object 350 is used as a template 
or skeleton, to guide the process of Compound Query 
Instantiation depicted in FIG. 19, in the sense that process 
depicted in FIG. 19 examines each Query Object in the 
GUI-Complete Boolean Compound Query Object 351, and 
if the GUI query element associated with it is "valid", a new 
Query Object Q*""* of the same type is created, its values 
set using set methods 361 for the Base Query Object 220, 
and using the data contained in the "valid" GUI query 
element, and 0***"'' is added to the Compound Query 
Expression 341, 1025, 1115, 1215, and 1315) of the "parent" 
Query Object for Q*""* using add methods 351 for the Base 
Query Object 220. The result of the process in FIG. 19 
therefore is a new Boolean Query Object 350 built out of 
Query Objects selected from, and instantiated (values set) 
from the corresponding Query Objects of the GUI-Complete 
Boolean Compound Query Object 3500. The order and 
grouping of "child" Query Objects in this final resulting 
Boolean Compound Query Object 350 is strictly parallel to 
order and grouping of the GUI-Complete Boolean Com- 
pound Query Object 3500, except for Query Objects that do 
appear in the resulting Boolean Compound Query Object 
350 because the GUI query elements associated with them 
were not "valid". 

The process applied to any given Query Object Q CUT - 
compteu at any level in this recursive nested structure begins 
with Step 1905 which creates a new but uninstantiated 
Query Object Q'"" 7 ' of exactly the same type. Uninstanti- 
ated means that if Q"""* is a Typed Elementary Query 
Object 260 it has no value (get method of 351 returns "null" 
or "empty" in common programming terms), or if Q™*"'' is 
a Typed Compound Query Object 340 its Compound Query 
Expression (341, 1025, 1115, 1215, and 1315) is empty (has 
no "child" Query Objects). Step 1910 determines whether 
the Query Object ^fui- Comp u tt fc % Typcd Elcmcntary 
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Query Object 260 or a Typed Compound Query Object 340. 
If the Query Object qG^^'" ^ either an Typed Elemen- 
tary Query Object 260 or an Operator Query Object 270, 
Step 1962 determines whether the Query Object has a 

5 "valid" GUI query element associated with it. If the GUI 
Query Element is "valid" (i.e., an end user has specified a 
value for it), Step 1966 sets the value of Q" M using set 
methods 361 in the Base Query Object 220, and'Step:1990^ 
^returnsJhT Q"^. For example a GUI query element such as 

1Q a text emry~field would be "valid" if an end user typed text 
keywords such as "Bill Clinton", and the typed text "Bill 
Clinton" would be used to set the value of a Text Atom 
Query Object 230. Conversely, if the GUI query element is 
not "valid", the Query Object Of*™* is still returned in Step 
1990 but it is has no value or is "empty". 

15 Returning to Step 1910, if Step 1910 determines that the 
Compound Query Expression (341, 1025, 1115, 1215, and 
1315) of the Query Object Q<™-c°*v''» i & a Typed Com- 
pound Query Object 340, Step 1920 determines if there are 
any Query Objects in its Compound Query Expression (341, 

20 1025, 1115, 1215, and 1315) left to process. If there is a 
Query Object q to process, Step 1930 applies the Compound 
Query Instantiation process to q, and returns the result q"**"'' 
to Step 1940. Step 1940 determines whether the Query is an 
Typed Elementary Query Object 260, or an Operator Query 

25 Object 270. If "yes" then Step 1950 determines whether the 
Query Object qf*""" has a value or not, using the get 
methods 361 in the Base Query Object 220. If the GUI 
Query Element has a value, Step 1956 adds the q*""'' to the 
Compound Query Expression (341, 1025, 1115, 1215, and 

30 1315) of Q*"**''. 

If Step 1940 determines that a Query Object q**™'' in the 
Compound Query Expression (341, 1025, 1115, 1215, and 
1315) of the "parent" Query Object Q is a Typed Compound 
Query Object 360, Step 1942 determines whether the Com- 

35 pound Query Expression (341, 1025, 1115, 1215, and 1315) 
for (f esutt has "child" Query Objects in its Compound Query 
Expresssion (341, 1025, 1115, 1215, and 1315). If "yes", 
(f suit is added to the Compound Query Expression (341, 
1025, 1115, 1215, and 1315) for Q*""*, which is the 

40 "parent" for q*""", using the add methods 351 in the Base 
Query Object 220. Q** 5 **" i s then returned in step 1990. 

When the process in FIG. 19 is completed there exists a 
new result "root" Boolean Compound Query Object 350, 
which contains a Compound Query Expression (341, 1025, 

45 1115, 1215, and 1315) containing Typed Compound Query 
Objects 340, which in turn contain Compound Query 
Expressions (341, 1025, 1115, 1215, and 1315) which con- 
tain Derived Query Objects 360, which themselves contain 
Compound Query Expressions 360, and so on in a "recursive 

50 nested" manner, the totality of which expresses a well- 
formed Boolean Compound Query Object 350 where "well- 
formed" means the Query Object can be linearized using the 
method 371, corresponding to the flowchart in FIG. 15, and 
executed using methods defined in 375 of the Base Query 

55 Object 220, and derived execute methods for each of the 
Derived Query Objects 360. 

The algorithm described in FIG. 20 is related to the 
disclosure of U.S. patent application Ser. No. 09/106,968 by 
Brereton, Coden and Schwartz, filed Jun. 30, 1998 and 

60 entitled "Method and System for Translating an Ad-Hoc 
Query to Structured Query Language Using Common Table 
Expressions" which is hereby incorporated by reference in 
its entirety. 

The present invention is now described in detail in the 
65£description of FIGS. 20-23 below. 

FIG.,20-is a flowchart illustrating a high performance 
execute method in a Boolean Compound Query Object 350 
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or a Compound Feature Query Object 330. It applies to any 260, Annotator Objects 290, Typed Compound Query 

Query Object where the query expression is a Boolean Objects 340 and Boolean Compound Query Objects 350. All 

expression of query objects each of which having its own these objects with the exception of the Base Query Objects 

execute method. are Derived Query Objects 360 which have a certain com- 

The same type of execute method could be also used in 5 monality being: 

other Query Objects like a Feature Query Object 330 where l) Each of the Derived Query Objects 360 may describe 

the Query Expression 1210 is a Boolean Expression of a part (or all) of a user query. 

Feature Atom Query Objects 250 and Annotator Query 2 ) Each of the Derived Query Objects 360 may contain 

Objects 290. oae or more methods ( 0 translate a query or sub-query into 

In general, a Query Expression 1310 in a Boolean Com- to a structured Query Language like SQL. 

pound Qu^ Object 3j» is an expression of Typed Com- 3) ABo olean Compound Query Object 350 describes the 

poundQueryObjec^40andAnnotatorObjects290wntten ^ ^ * ^ J ^ ion 

in infix notation. The input is a Que^Express.oo 1310 flf £ er 4 d ^ objects 360 . 

written m infix notation as shown id 2005. The first step is „ „ , „ . ^ . 

to convert this expression into one written in postfix notation is 4 ) ^mpound Query Object 350 may contain 

denoted as PF_Q as shown in 2010. This step is well °™° T m ™ m u ct ^ ° u n how t0 * c wb-qucnes 

documented in the literature. J? 10 ™ described by the instances of the Denved Query 

The next few steps are repeated for as long as there are 0b i ects u 360 m * JJ* ^ uei 7 whKh u P° n executlon 

any elements left in the query expression as will be detailed returns ,he resul1 1320 for ^ ^ <J uer * 

now. In 2015 it is checked whether there are any objects left 20 5 ) A Boolean Compound Query Object 350 may contain 

in the expression PF_Q. If there are any objects, the top one or more methods on how to translate the sub-queries 

TOP__0 is popped in 2020. In 2025 TOP_0 is examined 1310 mto sub-queries queries which upon execution return 

and it is put on stack TEMP in 2030 if it is a Derived Query results which can be combined with results returned from 

Object 360. In case TO_0 is an Operator Object step 2035 me execution of other sub-queries queries 1310. 

is executed: the two top objects 01 and 02 are popped from 25 6) A Boolean Compound Query Object 350 may contain 

the stack TEMP. Depending on the type of 01 and 02 the a set of resultcolumns 1325 which represents additional 

algorithm proceeds to one of the four boxes: 2040, 2045, information required for the final user query. 

2050, 2055. The decision to which of the four branches to The key points are now elaborated and examples are 

take is based on whether 01 and 02 are either both Query given to illustrate the methods and points but not limit the 

Objects or CTE Objects, or one of each (there are two boxes 30 application of the key points. For the examples, it is assumed 

for this case depending on which type of object was popped that the database has several tables Tl, T2 through Tn. Each 

first. All the boxes 2040, 2045, 2050 and 2055 perform the table contains several columns. Each table contains three 

same type of operation: they create an structured query columns named ID, START and STOP, 

language string representing the query expressed in objects 1) A Derived Query Object 360 Describes a Part (or All) of 

01 and 02 and the operator object TOP_0. The string is put 35 a User Query 

into a newly created CTE Object— CTE#— in 2060 which is Example: The user query is: return the ID, START and 

added to a chain of already created CTE objects in 2065. STOP of all records for which the film producer is HJTCH- 

Furthermore CTE is pushed on the stack TEMP 2030. Then COCK For this example it is assumed that there is a table 

the algorithm proceeds to 2015 to check whether there are which has also a column named PRODUCER. The Query 

any more objects left in the Boolean expression PF_Q. 40 Object comprises of a list of result columns: ID, START, 

When the last object was encountered, the CTE chain as STOP which are fully qualified to uniquely identify the 

created in 2065 is processed. First a string is instantiated table(s) from which the records are to be chosen, a list of 

with the keyword "WITH" (in SQL or a comparable word in conditions which are to be satisfied (e.g. PRODUCER- 

another similar structured query language). In the next step HITCHCOCK) and a method createSQL which translates 

it is tested whether the CTE chain is empty. If it is not empty, 45 the user query into a SQL statement. A Query Object 

the sqlstring which is part of the next CTE Object in the CTE contains also other methods which are helpful in writing 

chain is appended to the string s with the appropriate (for the applications using them, e.g. a method to execute this query 

structured query language) punctuation in step 2080. When against a database and methods which transform the results 

the last CTE Object is evaluated, the algorithm proceeds to into a form convenient for the user interface. There could be 

2085. 50 many different implementations of a query object, however 

A Boolean Compound Query Object 350 may contain a this is the preferred method, 

special set of resultcolumns 1325. These results columns are 2) A Derived Query Object 360 Contains One or More 

used to express the following user query which is explained Methods to Translate a Query Into SQL 

in two steps: Example: For the Query Object described in key point 1 

1) Determine a set of values for which a set of query 55 a method createSQL would return the following SQL state- 
condition holds. ment: 

2) To determine the final result, add the values in the 

specified result columns for which the user query evaluated scl ^., slAKr ' *** 11 wherc PRODUCER-* hitch- 
to true. 

In step 2085 the appropriate SELECT statement is created 60 Second (more complex) Example: If the user query is: return 

which expresses which additional resultcolumns the user the ID, START, STOP from Tl where TITLE- 'SKYDIVE', 

wants to see in the final result. and TITLE is flagged to be a text extender field with handle 

"titlehandle" where the Text Extender High Performance 
Query is to be used, the following can be generated by the 

FIGS. 3-18 described the various non-limiting examples 65 translate function. Note that there are "internal" common 

of the Object Query Architecture and its implementation of table expressions used. The entire sequence is tagged with a 

Base Query Objects 220, Typed Elementary Query Objects unique instance identifier (in this case, the "4" after 
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REPHANDLE, ROWRESULTUST, and MATCHTABLE) 
to distinguish the Common Table Expressions from others 
that may have been generated. 
WITH REPHANDLE4(MYDOCHANDLE) AS 

SELECT DB2TX.DB2TEXTH(prototypehandle) 

FROM DB2TX.TEXTC0LUMNS 

WHERE TABLESCHEMA-instanceName AND 

TABLENAME-'Tl' AND 

CO LUMNNAME- 4 TITLE ' 

) 

ROWESULTLIST4(RESULTDOCLIST) AS 
( 

SELECT DB2TX.HANDLE_LIST(MYDOCHANDLE, 
"'SKYDIVE'") 
FROM REPHANDLE4 

) 

MATCHTABLE4(handle, RESULTDOCLIST, cardinality, 
number) AS 

(SELECT db2tx.bandle(RESULTDOCLIST, 1), 
RESULTDOCLIST, 

db2tx.no_of_documents(RESULTDOCLIST), 
1 

FROM R0WRESULTLIST4 
WHERE db2tx.no_of_documents 
(RESULTDOCLIST)>0 
UNION ALL 

SELECT db2tx.handle(RESULTDOCLIST.number+ 
1), 

RESULTDOCLIST, 

cardinality, 

number+1 
FROM MATCHTABLE4 
WHERE number cardinality 

) 

QOflD, START, STOP) AS 

(SELECT id, start, stop 

FROM Tl 

,MATCHTABLE4 

WHERE HANDLEotitlebandlc 

) 

3) A Boolean Compound Query Object 350 Describes the 
Entire User Query 

Example: The user query is: return the ID, START and 
STOP of all records for which the film producer is HITCH- 
COCK and in which the word MURDER is spoken. Return 
the records rank ordered. In this example table T2 has a 
column in which the text is recorded and which can be 
searched by a text search engine (e.g., DB2 TextExtenders). 
The Boolean Compound Query Object 350 contains a list of 
Typed Compound Query Objects 340 and Operator Objects 
270: Ql (Query Object IX OP (Operator Object), Q2 (Query 
Object 2). Ql encapsulates the first sub-query (see example 
under key point 1), OP denotes and AND operator, Q2 
encapsulates the second query which contains a method to 
translate it into the following SQL statement: 

select ID, START, STOP, DB2TX.RANK<handle, 'MURDER*) 
from T2 

Id the above example, handle is the name of the column 
used for storing the index for the text search engine. In key 



point 4 it is shown on how to interpret the AND operator and 
in key point 5 it is shown bow to normalize the result sets 
of Ql and Q2 to be able to apply the AND operator. 
4) A Boolean Compound Query Object 350 Contains One or 

5 More Methods on How to Translate the Sub-queries Queries 
Into a Single Query Which Upon Execution Returns the 
Result Set for the Original User Query 

Example: The AND operator is defined in this step. A 
Boolean Compound Query Object 350 could have many 

10 methods each of which interpret the AND operator differ- 
ently. One possible definition of an AND operator is that it 
applies to the time intervals as defined by START and STOP 
in the tables: let Rl be the result set of sub-query Ql and R2 
be the result set of sub-query Q2. Then a record is in R, the 

15 result set of the Compound Query, if for an ID which is in 
Rl and in R2 there is a non empty time interval which is the 
intersection of a time interval in Rl and in R2. The resulting 
SQL expression is formulated as follows: 
SELECT ID, MAX(S1.START, S2 .START), MIN(Sl.STOP, 

20 S2.STOP) FROM 
SI, S2 WHERE 

S1.ID»S2.ID AND (Sl.START<S2.STOP) AND 
(S2.STOP<S1.STOP) 
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MAX function is implemented as 
CASE START 

WHEN S1.START>-S2.START THEN Sl.START 

ELSE S2 .START END 

MIN function is implemented as 
CASE STOP 

WHEN Sl.STOP>-S2.STOP THEN S2.STOP 

ELSE S2.STOP END 



The tables SI and S2 are determined as DB2 Common 
Table Expressions and then combined with the above 
expression 
WITH SI (ID, START, STOP) AS 
40 (SELECT (ID, START, STOP) FROM Tl WHERE 
PRODUCER-HITCHCOCK), 
S2 (ID, START, STOP) AS 

(SELECTED, START, STOP. DB2TX.RANK(handle, 
'"MURDER"')) FROM T2 
45 SELECT ID, MAX(S1.START, S2. START), MIN 
(Sl.STOP. S2.STOP) 

FROM SI, S2 WHERE S1.ID-S2.ID AND 

(Sl.START<S2.STOP) AND 
(S2.START<Sl.STOP) 

50 

However the above expression is not quite correct as the 
Common Table Expressions SI and S2 contain different 
number of columns. In key point 5 it is illustrated on how to 
normalize the result sets of the two sub-queries queries to 

55 make the above expression a correct SQL statements. 

The above examples illustrates on how an Operator 
Object could be interpreted. Now the general method for 
creating a single SQL statement from a Compound Query is 
illustrated The Compound Query contains a list of Query 

60 Objects, Operator Objects and Parenthesis Objects which 
form a Boolean expression in infix notation. The first step is 
to translate the Boolean expression from infix notation to 
post fix notation, an algorithm which is extensively 
described in the literature. An important part of such an 

65 algorithm is to define the precedence between operators (the 
order in which operators are evaluated). The algorithm used 
here, takes a precedence function as an input, in other words 
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the application can define it, as the meaning of AND, OR 
(for example) are overloaded. After translating the Boolean 
expression into post fix notation, the Operator Objects and 
Query Objects are in a list, ready for evaluation. 

Do while there are Objects in the list: 
If the object is a Query Object ->push it onto the stack 
Else { 

current object is Operator 
pop the stack ->Object 1 
pop the stack ->Object 2 

CTE Object=createOperatorExpression(Objectl, 

Object2, Operator) 
push CTE Object onto stack 
add CTE object to Vector v 

} 

} 

Each CTE object contains a SQL string which denotes the 
common table expression describing the query as defined 
between Objectl, Object2 and the Operator. The above 
example showed one implementation of such an expression 
and there will be more examples later on. 

To create the final single SQL statement the following 
algorithm has to be performed: 

String sqlString-"WITH"; 

int first =0; 

for (i-0; i<size(v); i++) { 

sqlString=sqlString+v[i].getSqlString( ); 
(if first==l) 

sqlS tringosqlString+","; 
else 

first" 1; 



} 

sqlString = sqlString + " SELECT 

DISTINCT" <columns>FROM v[n].getName( ); 

The <columns> are the final result columns as defined in 
the original user query. The DISTINCT feature of SQL is 
used to eliminate duplicate rows which could have been 
computed in the intermediate common table expression. 
This is due to an SQL requirements that if a UNION needs 
to be performed, a UNION ALL needs to be done. Note that 
the method createOperatorExpression(Objectl, Object2, 
Operator) encapsulates the meaning of the Operator as 
defined by the application. Furthermore this method should 
have multiple signatures, as both Objects (1 and 2) could be 
Query Objects or CTE Objects, or one of them a Query 
Object and the other a CTE Object 

Here are a few examples of methods createOperatorEx- 
pression. They all create a new CTE Object which contains 
the appropriate SQL string as shown here: 

First the OR operator is examined having a standard 
definition 

createOperatorExpressioon(Query Objectl Query Object2, 
OR): 

SI (ID, START, STOP) AS ( 

SELECT ID, START, STOP FROM Tl, T2 WHERE 
PRODUCER-HITCHCOCK OR ACTOR-'Cary 
Grant') 

Note that the application uses an optimization technique in 
this case to avoid a UNION ALL in the final SQL query. 



createOperatorExpressioon(Query Objectl, CTE Objectc, 
OR): 

5 SI (ID, START, STOP) AS ( 

SELECT ID, START, STOP FROM Tl WHERE 

PRODUCER—HITCHCOCK 
UNION ALL 

SELECT ID, START, STOP FROM Objectc) 



10 



Note that the "select" string is obtained by calling a method 
on Objectl and on Objectc 

createOperatorExpression(CTE ctel, CTE cte2, OR) 
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SI (ID, START, STOP) AS ( 

SELECT ID, START, STOP FROM ctel 
UNION ALL 
20 SELECT ID, START, STOP FROM cte2 

Now the AND expression is shown. Here the AND 
expression has the same meaning as in the previous example 
and hence some computation needs to be performed 



createOperatorExpression(Query Objectl, Query Object2, 
AND) 

30 MAX function is implemented as 

CASE START 
WHEN S1.START>-S2.START THEN Sl.START 
ELSE S2.START 
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MIN function is implemented as 
CASE STOP 
WHEN Sl.STOP>-S2.STOP THEN S2.STOP 
ELSE S2.STOP 

Need to define how SI and S2 are determined 



51 (ID, START, STOP) AS 
(SELECT (ID, START, STOP) FROM Tl WHERE 

PRODUCER-' HITCHCOCK'), 

52 (ID, START, STOP) AS 
(SELECT (ID, START, STOP) FROM T2 WHERE 

ACTOR-'Cary Grant'), 

53 (ID, START, STOP) AS 

(SELECT ID, MAXfSl.START, S2.START), MIN 
(Sl.STOP, S2.STOP) 
55 FROM SI, S2 WHERE S1.ID-S2.ID AND 

(Sl.START<S2.STOP) AND 
(S2.START<Sl.STOP)) 

createOperatorExpression(Query Object 1, CTE SI, AND) 

60 

S2 (ID, START, STOP) AS 

(SELECT (ID, START, STOP) FROM Tl WHERE 
PRODUCER-HITCHCOCK), 
65 S3 (ID, START, STOP) AS ( 

SELECT ID, MAX(S1.START, S2. START), MIN 
(Sl.STOP, S2.STOP) 
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FROM SI, S2 WHERE S1.ID-S2.ID AND 

(Sl.START<S2.STOP) AND 
(S2.START<S1.ST0P)) 
createOperatorExpression(CTE SI, CTE S2, AND) 

5 

S3 (ID, START, STOP) AS ( 

SELECT ID, MAX(S1.START, S2. START), MIN 
(Sl.STOP, S2.STOP) 
FROM SI, S2 

WHERE S1.1D-S2.ID AND io 
(Sl.START<S2.STOP) AND 
(S2.START<Sl.STOP)) 

5) A Boolean Compound Query Object 350 contains one 

or more methods on how translate the sub-queries queries 15 
into sub-queries queries which upon execution return result 
sets which can be combined with result sets returned from 
the execution of other sub-queries queries. 

An example in the section of key point 4 showed two 
sub-queries queries, one of which returns the columns ID, 20 
START, STOP, whereas the other returns the columns ID, 
START, STOP, RANK. To combine these the two result sets 
from the sub-queries queries, both should have the same 
result columns. Hence it is advisable to normalize the result 
set of the first sub-query and add a column RANK and put 25 
a zero value in there (or any other distinct value). This can 
be easily achieved by adding a RANK column in the SQL, 
query for query objects other than query objects which go 
against DB2 Extenders: 

The SQLL created to add a column of value 0 (or any 30 
other value) is like: 

SELECT ID, START, STOP, CASE WHEN START>0 THEN 0 
END AS RANK 

Note that when normalizing sub-queries queries other 35 
conditions need to be added on how to apply an operator 
between two rows which both have a RANK column. Again 
this can be done using a case statement. For example, for an 
AND operator the case could specify that the value of the 
RANK column should be 0 if both RANK columns have 0 40 
value in them, otherwise the bigger value. Any function on 
combining RANK can be implemented here. 

6) A Boolean Compound Query Object 350 may contain 
a set of resultcolumns which specify which additional col- 
umns are required for the final user query. 45 

In key point 1-6, it was shown how to construct a single 
SQL query, where the query specifications are expressed in 
an arbitrary Boolean expression. The constraint is that the 
result columns of all the sub-queries queries have to be the 
same — a fact that can be achieved using the "normalization" 50 
technique as shown in key point 5. However, another query 
can be appended which determines an arbitrary set of 
columns from another set of tables (in other words another 
fullselect in DB2 terms). 

The normalization method is based on U.S. Pat. No. 55 
5,873,080, entitled "Using Multiple Search Engines to 
Search Multimedia Data," assigned to the same assignee as 
the present invention. 

Example: Building on the examples given throughout, the 
set of common table expression returns ID, START, STOP, 60 
RANK in its columns. Suppose the user query requires a set 
of columns TITLE, NAME, ADDRESS which are stored in 
Tl and PHONE which is stored in T2 for each row which 
has been determined by using the common table expres- 
sions. Hence, tables Tl and T2 also contain columns ID, 65 
START, STOP in them and the additional information is 
required for rows whose ID matches an ID in the last 
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computed common table expression Tn and whose START 
and STOP times have a relation to the START and STOP 
time in Tn (e.g., contains, same). Such relations are denoted 
as function f<#> of the appropriate arguments. 
In that case the final SELECT of the expression would be 
SELECT TITLE, NAME, ADDRESS, PHONE from'Tl, 
T2 where 

ID.T1-ID.T2 AND ID.Tl-ID.Tn AND fl(Tl. START, 
12. START, Tn. START) AND 
f2(Tl.STOP, 12. STOP, Tn.STOP) 

FIG. 21 is a flowchart and detail illustrating a high 
performance execute method 965 to perform a free text 
query. In this preferred embodiment, IBM's DB2 is the 
underlying database, DB2's TextExtender is used to index 
the data and its primitives are used to perform execute 
method 965. 

An example of such a query is: Find all documents which 
contain the word IMPEACHMENT in the column TITLE 
and for these documents return the values which are in the 
DATE and PRODUCER columns. The word IMPEACH- 
MENT is the query expression 910, the column TITLE is 
specified in the query column 925 and the DATE and 
PRODUCER columns are the resultcolumns 930. 

More specifically the query column 925 is specified by 
atrip let <S1, Tl, Fl> where SI specifics the schema, Tl 
specifies the table name, and Fl the column name. (The 
terms schema, table name and column name are well defined 
within IBM's DB2.) The system however needs to know 
whether any given column specified by <S1, Tl, Fl> had 
been indexed by DB2 TextExtender, such that a free text 
query can be performed against it. If it has been indexed, 
access to the index can be obtained through a variable which 
represents such index. The following query returns such a 
variable if it exists and is performed against DB2 for every 
column <S1, Tl, Fl>. 
SELECT handle name 
FROM DB2TX.TEXTCOLUMNS 
WHERE TABLESCHEMA-'Sl' AND 
TABLENAME-'TT AND 
COLUMNNAME-'FT 

If the above specified query returns no rows, then no free 
text indexing has been performed and another execute 
method must be used. If a row is returned, the value in that 
row is the assigned handlename 920 representing the index 
for the column. If more than one row is returned, an error 
condition exists in the database. 

The process for determining handlenames 920 is expen- 
sive to perform during ad-hoc user sessions and thus the 
following performance enhancement is performed. The 
application has a hashtable HT which is seeded prior to the 
execution of any user ad-hoc queries since generally the set 
up for DB2's TextExtender indexing is performed only once 
during the database creation and the handlenames do not 
change once they are set up. The keys to the hashtablcs are 
the column names within the underlying DB2 database 
which have been enabled for DB2 TextExtender indexing (as 
determined by the system administrator). The value 
recorded in hashtable HT is a string representing the handle- 
name 920 for that column. If a value exists for a particular 
key column, then that column is considered by the system to 
be enabled for text indexing using the handlename value. If 
no value exists for a column name, then no such indexing is 
presumed. 

FIGS. 22 and 23 show two different methods for creating 
this hashtable HT. This hashtable HT is introduced for 
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performance reasons because it is generally much quicker to 
perform a hashtable lookup than a database query to deter- 
mine the same information multiple times. The form or 
algorithm of the hashtable itself is not germaine to the 
discussion of this disclosure but it is assumed to be a well 5 
performing hashtable. 

The input to the execute method 965 is a set of query 
expressions and query objects. Details about the preparation 
of the input parameters can be found in U.S. patent appli- 
cation by Brereton, Coden and Schwartz: Method and Sys- to 
tem for Translating an Ad-Hoc Query to Structured Query 
Language Using Common Table Expressions which is incor- 
porated above. 

In summary, the input to this execute method is a stack. 
An element on the stack ST is either a fourtuplet <Cj, Sj, Tj, 15 
Fj> where Cj is a query expression 910 and Sj, Tj and Fj 
represent a query column 925 or an Operator Query Object 
270 and is shown in box 2100 of FIG. 2100. Furthermore, 
there is a counter (integer) I which is initialized to 0 and an 
empty stack CTE_STACK. 20 

In box 2110 it is checked whether there are any elements 
left on the stack ST. If ST is empty, the process continues to 
box 2180 where the CTE_STACK is processed. This pro- 
cess is described in FIG. 20 starting with box 2068. 

In case there are objects left on ST, the top object O is 25 
popped in 2120. In 2130 it is examined whether O is an 
operator query object In case it is not, the process proceeds 
to 2135 where the values Sj, Tj, Fj are examined to deter- 
mine whether the query column 925 was indexed for DB2 
TextExtenders by a lookup into the hashtable HT. If the 30 
column had not been indexed this execute method termi- 
nates and a different execute method needs to be invoked. 
Otherwise, in box 2138 it is determined which handlename 
920 is associated with the query column 925. Towards this 
end the hashtable entry retrieved from step 2135 is exam- 35 
ined. The string denoting the index is stored into a variable 
handlename 920. The process then continues to box 2140 
which is described in more detail in a subsequent paragraph. 
However, in this box a new Common Table Expression 
Query Object (FIG. 18) is created which contains a sophis- 40 
ticated string sqlString 1820 which is described in more 
detail in a subsequent paragraph. 

In case the Object 0 in box 2130 is an Operator Query 
Object, the process proceeds to box 2150, where also a new 
Common Table Expression Query Object (FIG. 18) is ere- 45 
ated. Again it contains a sophisticated sqlString 1820 which 
is described in more detail in a subsequent paragraph. 

The process then proceeds to box 2160 where the counter 
I is incremented by one and then to box 2170 where the CTE sq 
Object created in either box 2140 or 2150 is pushed onto the 
CTE_STACK. 

The sqlString created for the CTE Object in box 2138 will 
now be described. It is based on a template derived from the 
following example provided in the IBM manual, DB2 Text $s 
Extender Administration and Programming Guide which is 
repeated here for completeness. 

In this example, a column with the name TITLE is 
searched for the word "IMPEACHMENT. The specifica- 
tion for the column contains the tableschema DB2TX and 
the tablename SAMPLE. The query is further constrained by 
the query condition 915 that the resulting documents are 
from the year 1995. 
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WITH REPHANDLE(MYDOC HANDLE) AS (SELECT 
DB7TXX)B2TEXTH(prototypehandle) FROM 
DB2TX.TEXTCOLUMNS 
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(1) 



WHERE TABLESCHEMA-' DB2TX' AND 
TAB LENAME-' SAMPLE' AND 
C O LU MNN AME= ' TITLE ' , 

ROWRESULTUST( RESULTDOCLIST) AS (SELECT 

DB2TX.HANDLE_liST(MYDOCHAKDLE, "'IMPEACH- 
MENT") (2) 

FROM REPHANDLE), 

MATCHTAELE( handle, RESULTDOCUST, cardinality, number) 
AS (SELECT db2U.bandJe(RESUlTDOCLIST,l), (3) 

RESULTDOCUST, 

db2tx.no_of_documents(RESULTDOCLIST), 
1 

FROM ROWRESULTLIST 

WHERE db2tx.no_of_doaiments(RESULTDOCUST) 
>0 

UNION ALL SELECT db2tx.hand!e(RESULTDOCLIST, number* 
1). (4) 

RESULTDOCLIST, 

cardinality, , 
number+1 

FROM MATCHTABLE 
WHERE number<cardinality) 

SELECT comment FROM db2tJLsampIe, MATCHTABLE (5) 

WHERE year(date)-1995 AND 
commenthandle=HANDLE; 

The query above is a very specific example of how one 
can use a recursive query, a specific handle name and the 
handle lists provided by DB2 TcxtExtcndcr to quickly query 
an index. 

In particular (1) creates a temporary table REPHANDLE 
which has a single column MYDOCHANDLE. The value in 
this column is prototypehandle which specifies the DB2 
handle for the TITLE column which is further defined by the 
TABLESCHEMA and TABLENAME. 

In (2) a temporary table ROWRESULTLIST is created 
which has a single column RESULTDOCLIST. The value in 
this column is a pointer to the result list of documents which 
contain the word compress in the TITLE column. The result 
list is in the form of a pointer to a list of handles which point 
to the documents themselves. 

Sections (3) and (4) are the recursive part to process the 
handlelist. In particular (3) is the initial sub-query which 
returns the first relevant handle. The RESULTDOCLIST 
column and the number of documents in the handle list is the 
same in all the rows but are used to assure that the recursion 
stops. The second part (4) is called the recursive sub-query 
that adds more rows to the temporary table MATCHTABLE 
based on the rows which are already there. Note that each 
time the recursive sub-query (4) is executed it sees only the 
rows that were added by the previous iteration. 

lo (5) the final result is obtained which contains the 
TITLE column for all the documents from 1995 and which 
contains the word IMPEACHMENT. 

This method has proven to work quite quickly and effec- 
tively for specific queries on a single DB2 TextExtender 
index. However, it does not describe how to query multiple 
indeces (using multiple handlenames). The process 
described in FIG. 21 shows how to generalize the above 
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template to accommodate simultaneous searches for mul- 
tiple handlenames 920 representing multiple query columns 
925. 

An example of a query involving multiple indices would 
be of the form: Find all documents which have the word 
IMPEACHM ENT in the TITLE column or the word NIXON 
in the COMMENT column and for these documents return 
the values which are in the DATE and PRODUCER col- 
umns. The words IMPEACHMENT and NIXON are the 
query expression 910, the column TITLE and COMMENT 
are specified in the query column 925 and the DATE and 
PRODUCER columns are the resultcolumns 930. 

The sqiString 1820 created in 2140 is of the following 
form. Note that I is the value of the counter transformed into 
a string. It assures that all the sqlStrings in all the CTE 
Objects created in box 2140 have unique table names. 
REPHANDLEI(MYDOCHANDLE) AS 
(SELECT DB2TX.DB2TEXTH(prototypehandle) 
FROM DB2TX.TEXTCOLUMNS 
WHERE TABLESCHEMA='Sj' AND 
TABLENAME-'Tj'AND 
COLUMNNAME-'Fj' 

X 

ROWRESULTLISTI(RESULTDOCLIST) AS 
(SELECT DB2TX.H ANDLE_LIST 

(MYDOCHANDLE, "'Cj'") 

FROM REPHANDLEI 

)> 

MATCHTABLEI (handle, RESULTDOCLIST, 

cardinality, number) AS 
SELECT db2tx.handle(RESULTDOCLIST,l), 

RESULTDOCLIST, 

db2tx.no_of_documents(RESULTDOCLIST), 
1 

FROM ROWRESULTLISTI 

WHERE db2tx.no_of_documents 
(RESULTDOCUST)>0 
UNION ALL 

SELECT db2tx.handle(RESULTDOCUST,numbcr+l) > 
RESULTDOCLIST, 
cardinality, 
number+1 

FROM MATCHTABLEI 
WHERE number<cardinality 

) 

The sqiString created in box 2150 is described now. In 
particular, the strings are shown for the operators AND and 
OR, however, other operators can be easily accommodated 
here. 

First the sqiString 1820 for the AND operator is 
described. Note that the value I is the last counter value as 
determined in box 2160. The columns Kl through Kn are 
one set of query conditions 915. 

QI (Kl, . . . Kn) AS 

(SELECT Kl Kn) FROM Q(I-l), Q(l-2) 

WHERE Q(I-1).K1-Q(I-2).K1 
AND Q(I-1).K2-Q(I-2).K2 

AND Q(I-l).Kn-Q(I-2).Kn 

First the sqiString 1820 for the OR operator is described. 
Note that the value I is the last counter value as determined 
in box 2160. 



to 



QI (Kl, . . . Kn) AS 

(SELECT Kl, . . Kn) FROM Q(I-1), Q(I-2) 
WHERE Q(I-1).KUQ(I-2).K1 
UNION ALL 
(SELECT Kl, . . . Kn) FROM Q(I-1), 0(1-2) 
WHERE Q(I-1).K2-Q(I-2).K2 

UNION ALL 
(SELECT Kl, . . . Kn) FROM Q(I-l), Q(I-2) 
WHERE Q(I-l).Kn=Q(I-2).Kn 
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FIG. 22 The hashtable HTcan be loaded from a properties 
file (such as Java's "java.utiLProperties" object class, Win- 
dows 95/98 registry or other form of data initialization) 
which the system administrator prepares at the time be or she 
performs the necessary "enable text column" commands for 
DB2 TexlExtender indices. For each "enable text tablename 
column columnname title handle handlename** command 
executed (where a table is represented as S1.T1 and col- 
umnname is represented as Fl), an entry is made into a 
20 properties file that contains the fully qualified column name 
and the handlename assigned from the command. Only DB2 
TextExtender enabled columns would receive an entry. 
Columns without a corresponding hashtable entry would be 
presumed disabled for DB2 TextExtender indexing. The 
25 properties file is then loaded into the hashtable HT using the 
prescribed means (e.g. java.util.Properties.load( ) function 
or regedit in Windows 95/98). 

This method for the hashtable creation is more formerly 
described in FIG. 22 which begins in box 2210 with the list 
of "enable text" commands. The method then proceeds to 
box 2220 where it is determined if there are any more 
"enable text" commands to process. If there are no more to 
process, the method terminates (box 2250). If there are 
more, the method proceeds to box 2230. In 2230, an entry 
of the form Sj.Tj.Fj-handlename is placed into hashtable 
HT. The method proceeds to box 2240 where we move on 
to the next "enable text" command. The method returns to 
box 2220. 

FIG. 23 An alternative method for hashtable creation is 
described in this figure. This method is useful if there are a 
40 large or unknown number of DB2 TextExtender indexed 
columns. It begins in box 2310 with a list of columns of the 
form Sj.Tj.Fj which the system administrator has determined 
to be "queryable". That is, columns on which the users are 
allowed to form queries. The method then proceeds to box 
45 2320 where it is determined if there are any more "query" 
columns to process. If there are no more to process, the 
method terminates (box 2350). If there are more, the method 
proceeds to box 2330. In 2330, the following query is 
submitted to DB2: 
SELECT handlename 
FROM DB2TX.TEXTCOLUMNS 
WHERE TABLESCHEMA-'Sj* AND 
TABLENAME-'Tj* AND 
COLUMNNAME-'Fj' 

55 In box 2340, if the query returns no rows, the method 
proceeds to box 2360. If the query returns more than one 
row, an error condition exists and the process terminates 
(box 2380). If one row is returned, then the contents of that 
row is the handlename and the method proceeds to box 2350 

60 where an entry of the form Sj.Tj.Fj -handlename is placed 
into hashtable HT. The method proceeds to box 2360. In 
2360, the method moves on to the next "queryable" column. 
The method returns to box 2320. 
We claim: 

65 1. A computer system having one or more central pro- 
cessing units, one or more memories, and one or more 
databases, the computer system further comprising: 
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one or more query objects created by a client process, the 
query objects having one or more sub-query objects 
and one or more execute methods that are capable of 
operating on their respective query object to produce 
one or more query expressions, all of the execute 
methods capable of producing the respective query 
expression that is compatible with a structured query 
language; and 

a compound query containing one or more boolean 
expressions of one or more of the query objects, the 
compound query having one or more compound 
execute methods which invoke one or more of the 
execute methods of each of the query objects, each of 
the execute methods returning their respective query 
expression, the compound execute method using one or 
more common table expressions to combine the query 
expressions to form a single compound query expres- 
sion that represents the boolean expression and can be 
executed against a database to return a result without 
executing any of the query expressions against the 
database individually. 

2. A system as in claim 1, where the query objects are any 
one or more of the following types: a Compound Free Text 
Query Object, a Compound Boolean Text Query Object, a 
Compound Parametric Query Object, a Compound Feature 
Query Object and a Boolean Compound Query Object. 

3. A system, as in claim 1, where one or more of the query 
objects is a Compound Free Text Query Object, each Com- 
pund Free Text Query Object having one or more text atoms, 
and the compound execute method creating one or more 
common table expressions, one or more of common table 
expression corresponding to one of the text atoms. 

4. A system, as in claim 3, where one or more of the 
common table expressions corresponds to one or more 
combinations of the text atoms. 

5. A system, as in claim 1, where the results are ranked by 
a ranking process. 

6. A computer system having one or more central pro- 
cessing units, one or more memories, and one or more 
databases, the computer system further comprising: 

a means for creating one or more query objects, the query 
objects having one or more sub-query objects and one 
or more execute methods that are capable of operating 
on their respective query to produce one or more query 



producing the respective query expression that is com- 
patible with a structured query language; and 
invoking one or more of the execute methods of each of 
the query objects by using one or more compound 

5 execute methods in a compound query, the compound 
query containing one or more boolean expressions of 
one or more of the query objects, each of the execute 
methods returning their respective query expression, 
the compound execute method using one or more 

10 common table expressions to combine the query 
expressions to form a single compound query expres- 
sion that represents the boolean expression and 
executes against a database to return a result without 
executing any of the query expressions against the 

!5 database individually. 

8. A computer program product which performs the steps 
of: 

creating one or more query objects, the query objects 
having one or more sub-query objects and one or more 

20 execute methods that are capable of operating on their 
respective query object to produce one or more query 
expressions, all of the execute methods capable of 
producing the respective query expression that is com- 
patible with a structured query language; and 

25 invoking one or more of the execute methods of each of 
the query objects by using one or more compound 
execute methods in a compound query, the compound 
query containing one or more boolean expressions of 
one or more of the query objects, each of the execute 

30 methods returning their respective query expression, 
the compound execute method using one or more 
common table expressions to combine the query 
expressions to form a single compound query expres- 
sion that represents the boolean expression and 

35 executes against a database to return a result without 
executing any of the query expressions against the 
database individually. 

9. The computer system of claim 1, further comprising a 
step to submit the single compound query expression to the 

40 database for execution. 

10. The computer system of claim 1, wherein each of two 
of the query expressions define a result set, and wherein the 
compound execute method normalizes one of the result sets. 

11. The computer system of claim 10, wherein one result 



expressions, all of the execute methods capable of 45 set comprises a first number of columns that is less than a 



producing the respective query expression that is com- 
patible with a structured query language; and 
means for invoking one or more of the execute methods 
of each of the query objects by using one or more 
compound execute methods in a compound query, the so 
compound query containing one or more boolean 
expressions of one or more of the query objects, each 
of the execute methods returning their respective query 
expression, the compound execute method using one or 
more common table expressions to combine the query 55 
expressions to form a single compound query expres- 
sion that represents the boolean expression and 
executes against a database to return a result without 
executing any of the query expressions against the 
database individually. 50 
7. A method executing on a computer system comprising 
the steps of: 

creating one or more query objects, the query objects 
having one or more sub -query objects and one or more 
execute methods that are capable of operating on their 65 
respective query object to produce one or more query 
expressions, all of the execute methods capable of 



second number of columns defined by the other of the two 
result sets, and wherein the compound execute method 
normalizes the one result set by adding an appropriate 
number of columns to the one result set so that the number 
of columns of the one result set after the addition of columns 
is equivalent to the second number of columns. 

12. The computer system of claim 6, further comprising 
a means for submitting the single compound query expres- 
sion to the database for execution. 

13. The method of claim 7, further comprising a step of 
submitting the single compound query expression to the 
database for execution. 

14. The method of claim 7, wherein each of two of the 
query expressions define a result set, and wherein the 
method further comprises the step of the compound execute 
method normalizing one of the result sets. 

15. The method of claim 14, wherein one result set 
comprises a first number of columns that is less than a 
second number of columns defined by the other of the two 
result sets, and wherein the step of the compound execute 
method normalizing one of the result sets further comprises 
the step of normalizing the one result set by adding an 
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appropriate number of columns to the one result set so that 
the number of columns of the one result set after the addition 
of columns is equivalent to the second number of columns. 

16. The computer program product of claim 8, further 
comprising a step of submitting the single compound query 5 
expression to the database for execution. 

17. The computer program product of claim 8, wherein 
each of two of the query expressions define a result set, and 
wherein the computer program product further comprises a 
step of the compound execute method normalizing one of 10 
the result sets. 



50 

18. The computer program product of claim 17, wherein 
one result set comprises a first number of columns that is less 
than a second number of columns defined by the other of the 
two result sets, and wherein the step of the compound 
execute method normalizing one of the result sets further 
comprises the step of normalizing the one result set by 
adding an appropriate number of columns to the one result 
set so that the number of columns of the one result set after 
the addition of columns is equivalent to the second number 
of columns. 

***** 
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